Getting WARN_ON in hres_timers_resume after Xen resume

From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	Rusty Russell <rusty@rustcorp.com.au>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Getting WARN_ON in hres_timers_resume after Xen resume
Date: Tue, 20 May 2008 15:54:39 +0100
Message-ID: <4832E62F.9030901@goop.org>

I'm implementing suspend/resume for Xen at the moment.  It's all going 
well, but I'm getting this WARN_ON:

------------[ cut here ]------------
WARNING: at /home/jeremy/hg/xen/paravirt/linux/kernel/hrtimer.c:635 hres_timers_resume+0x33/0x56()
Modules linked in:
Pid: 1397, comm: kstopmachine Tainted: G        W 2.6.26-rc2-sched-devel.git #94
 [<c102e87d>] warn_on_slowpath+0x41/0x5d
 [<c10477a1>] ? clockevents_program_event+0x105/0x10d
 [<c1047dd3>] ? tick_resume+0x5c/0x61
 [<c100145d>] ? xen_restore_fl+0x2e/0x52
 [<c100145d>] ? xen_restore_fl+0x2e/0x52
 [<c104b8da>] ? trace_hardirqs_off+0xb/0xd
 [<c139b67e>] ? _spin_unlock_irqrestore+0x56/0x6c
 [<c1047dd3>] ? tick_resume+0x5c/0x61
 [<c1047e2d>] ? tick_notify+0x55/0x60
 [<c139db0a>] ? notifier_call_chain+0x32/0x64
 [<c1047960>] ? clockevents_notify+0x42/0x46
 [<c100145d>] ? xen_restore_fl+0x2e/0x52
 [<c104cc50>] ? lock_release+0x71/0x77
 [<c1047960>] ? clockevents_notify+0x42/0x46
 [<c1042192>] hres_timers_resume+0x33/0x56
 [<c1045255>] timekeeping_resume+0x14e/0x157
 [<c11b6ecc>] __sysdev_resume+0x14/0x38
 [<c11b7091>] sysdev_resume+0x36/0x69
 [<c11ba59e>] device_power_up+0x8/0xf
 [<c1183476>] xen_suspend+0x9a/0xb2
 [<c105fd3d>] do_stop+0x17/0x61
 [<c105fd26>] ? do_stop+0x0/0x61
 [<c103f806>] kthread+0x37/0x59
 [<c103f7cf>] ? kthread+0x0/0x59
 [<c100782b>] kernel_thread_helper+0x7/0x10

The WARN_ON is correct, because I do have other CPUs online.  However,
I'm in the middle of stop_machine, so they're effectively offline as
far as the rest of the system is concerned. (Xen suspend doesn't require
all the CPUs to be offlined, and not offlining them makes things a fair
bit faster and cleaner.)

It seems to me that either:

   1. stop_machine is enough like offlining that we can remove stopped
      cpus from the online map, or
   2. the check in hres_timers_resume is too strong, and can be either
      weakened or removed, or
   3. hres_timers_resume needn't be called here at all, or
   4. I'm missing something, and I'm introducing a bug
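
To make the difference between options 1 and 2 concrete, here is a rough
userspace model, not kernel code.  It assumes the check amounts to "warn
if more than one CPU is online", which is what the warning above suggests.
Every name in it (cpumask_model_t, mask_weight, strict_check_warns,
weakened_check_warns) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: bit n set means CPU n is in the mask.
 * None of these names are kernel APIs. */
typedef unsigned int cpumask_model_t;

/* Count the set bits in a mask (number of CPUs it contains). */
unsigned int mask_weight(cpumask_model_t m)
{
    unsigned int n = 0;

    for (; m; m &= m - 1)   /* clear the lowest set bit each pass */
        n++;
    return n;
}

/* Assumed current behaviour: warn whenever more than one CPU is online. */
bool strict_check_warns(cpumask_model_t online)
{
    return mask_weight(online) > 1;
}

/* Option 2 (weakened check): CPUs parked in stop_machine do not count. */
bool weakened_check_warns(cpumask_model_t online, cpumask_model_t stopped)
{
    return mask_weight(online & ~stopped) > 1;
}
```

With four CPUs online (mask 0xF) and CPUs 1-3 held in stop_machine
(mask 0xE), strict_check_warns(0xF) is true while
weakened_check_warns(0xF, 0xE) is false.  Option 1 corresponds to passing
(online & ~stopped) to the strict check instead, which gives the same
answer but changes what the rest of the system sees as online.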

BTW, once everything is out of stop_machine, I call clock_was_set() to 
make sure that timers are retriggered on all CPUs.

Thoughts?

Thanks,
    J