From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
Rusty Russell <rusty@rustcorp.com.au>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Getting WARN_ON in hres_timers_resume after Xen resume
Date: Tue, 20 May 2008 15:54:39 +0100
Message-ID: <4832E62F.9030901@goop.org>
I'm implementing suspend/resume for Xen at the moment. It's all going
well, but I'm getting this WARN_ON:
------------[ cut here ]------------
WARNING: at /home/jeremy/hg/xen/paravirt/linux/kernel/hrtimer.c:635 hres_timers_resume+0x33/0x56()
Modules linked in:
Pid: 1397, comm: kstopmachine Tainted: G W 2.6.26-rc2-sched-devel.git #94
[<c102e87d>] warn_on_slowpath+0x41/0x5d
[<c10477a1>] ? clockevents_program_event+0x105/0x10d
[<c1047dd3>] ? tick_resume+0x5c/0x61
[<c100145d>] ? xen_restore_fl+0x2e/0x52
[<c100145d>] ? xen_restore_fl+0x2e/0x52
[<c104b8da>] ? trace_hardirqs_off+0xb/0xd
[<c139b67e>] ? _spin_unlock_irqrestore+0x56/0x6c
[<c1047dd3>] ? tick_resume+0x5c/0x61
[<c1047e2d>] ? tick_notify+0x55/0x60
[<c139db0a>] ? notifier_call_chain+0x32/0x64
[<c1047960>] ? clockevents_notify+0x42/0x46
[<c100145d>] ? xen_restore_fl+0x2e/0x52
[<c104cc50>] ? lock_release+0x71/0x77
[<c1047960>] ? clockevents_notify+0x42/0x46
[<c1042192>] hres_timers_resume+0x33/0x56
[<c1045255>] timekeeping_resume+0x14e/0x157
[<c11b6ecc>] __sysdev_resume+0x14/0x38
[<c11b7091>] sysdev_resume+0x36/0x69
[<c11ba59e>] device_power_up+0x8/0xf
[<c1183476>] xen_suspend+0x9a/0xb2
[<c105fd3d>] do_stop+0x17/0x61
[<c105fd26>] ? do_stop+0x0/0x61
[<c103f806>] kthread+0x37/0x59
[<c103f7cf>] ? kthread+0x0/0x59
[<c100782b>] kernel_thread_helper+0x7/0x10
The WARN_ON is correct, because I do have other CPUs online. However,
I'm in the middle of stop_machine, so they're effectively off-line as
far as the rest of the system is concerned. (Xen suspend doesn't require
all the CPUs to be offlined, and not doing so makes things a fair bit
faster and cleaner.)
It seems to me that either:

   1. stop_machine is enough like offlining that we can remove stopped
      cpus from the online map, or
   2. the check in hres_timers_resume is too strong, and can be either
      weakened or removed, or
   3. hres_timers_resume needn't be called here at all, or
   4. I'm missing something, and I'm introducing a bug.
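
To make the difference between the current check and option 2 concrete,
here is a minimal userspace sketch. It is not kernel code: struct
cpu_state, count_online(), count_running(), strict_check_warns() and
weakened_check_warns() are all hypothetical names invented for this
illustration, and the "stopped" flag is an assumed stand-in for a CPU
parked by stop_machine. It assumes the existing check fires whenever more
than one CPU is in the online map, which is what the warning above
suggests.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct cpu_state {
	bool online;   /* CPU is set in the online map */
	bool stopped;  /* CPU is parked by stop_machine (assumed flag) */
};

/* Count CPUs that are in the online map, stopped or not. */
static int count_online(const struct cpu_state *cpus, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (cpus[i].online)
			count++;
	return count;
}

/* Count CPUs that are online and not parked by stop_machine. */
static int count_running(const struct cpu_state *cpus, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (cpus[i].online && !cpus[i].stopped)
			count++;
	return count;
}

/* Models the existing check: warn if more than one CPU is online. */
static bool strict_check_warns(const struct cpu_state *cpus, int n)
{
	return count_online(cpus, n) > 1;
}

/* Models option 2: warn only if more than one CPU is actually running. */
static bool weakened_check_warns(const struct cpu_state *cpus, int n)
{
	return count_running(cpus, n) > 1;
}
```

With two online CPUs of which one is stopped, strict_check_warns()
returns true while weakened_check_warns() returns false, which is the
situation during the Xen resume path described above.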
BTW, once everything is out of stop_machine, I call clock_was_set() to
make sure that timers are retriggered on all CPUs.
Thoughts?
Thanks,
J