From: Peter Zijlstra <peterz@infradead.org>
To: Marc Dionne <marc.c.dionne@gmail.com>
Cc: linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Xiaotian Feng <dfeng@redhat.com>, Ingo Molnar <mingo@elte.hu>
Subject: Re: BUG during shutdown - bisected to commit e2912009
Date: Sat, 02 Jan 2010 01:42:00 +0100 [thread overview]
Message-ID: <1262392920.32223.10.camel@laptop> (raw)
In-Reply-To: <6041d2001001011627o5c494df4v37c0c466df3d444c@mail.gmail.com>
On Fri, 2010-01-01 at 19:27 -0500, Marc Dionne wrote:
> I'm getting a BUG with current kernels from
> kernel/time/clockevents.c:263 when halting the system - a restart
> behaves normally. I don't have a good camera handy at the moment to
> capture the call stack on screen, but the call sequence is:
>
> clockevents_notify
> hrtimer_cpu_notify
> notifier_call_chain
> raw_notifier_call_chain
> _cpu_down
> disable_nonboot_cpus
> kernel_power_off
> sys_reboot
>
> I bisected it down to commit e2912009: sched: Ensure set_task_cpu() is
> never called on blocked tasks. There were a few commits tested along
> the way where I got a freeze (with the power still on) instead of a
> BUG. Reverting that commit from the current kernel doesn't look
> trivial, but the commit immediately preceding this one does halt fine.
We somehow seem to trip up the below patch, which doesn't really make
sense, as I can't find how task placement would affect the below error.
It seems to purely test against the hot-unplugged cpu, not a cpu the
task is running on.
---
commit bb6eddf7676e1c1f3e637aa93c5224488d99036f
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Thu Dec 10 15:35:10 2009 +0100
clockevents: Prevent clockevent_devices list corruption on cpu hotplug
Xiaotian Feng triggered a list corruption in the clock events list on
CPU hotplug and debugged the root cause.
If a CPU registers more than one per cpu clock event device, then only
the active clock event device is removed on CPU_DEAD. The unused
devices are kept in the clock events device list.
On CPU up the clock event devices are registered again, which means
that we list_add an already enqueued list_head. That results in list
corruption.
Resolve this by removing all devices which are associated to the dead
CPU on CPU_DEAD.
Reported-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Xiaotian Feng <dfeng@redhat.com>
Cc: stable@kernel.org
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 20a8920..91db2e3 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -238,8 +238,9 @@ void clockevents_exchange_device(struct clock_event_device *old,
*/
void clockevents_notify(unsigned long reason, void *arg)
{
- struct list_head *node, *tmp;
+ struct clock_event_device *dev, *tmp;
unsigned long flags;
+ int cpu;
spin_lock_irqsave(&clockevents_lock, flags);
clockevents_do_notify(reason, arg);
@@ -250,8 +251,19 @@ void clockevents_notify(unsigned long reason, void *arg)
* Unregister the clock event devices which were
* released from the users in the notify chain.
*/
- list_for_each_safe(node, tmp, &clockevents_released)
- list_del(node);
+ list_for_each_entry_safe(dev, tmp, &clockevents_released, list)
+ list_del(&dev->list);
+ /*
+ * Now check whether the CPU has left unused per cpu devices
+ */
+ cpu = *((int *)arg);
+ list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
+ if (cpumask_test_cpu(cpu, dev->cpumask) &&
+ cpumask_weight(dev->cpumask) == 1) {
+ BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
+ list_del(&dev->list);
+ }
+ }
break;
default:
break;
next prev parent reply other threads:[~2010-01-02 0:42 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-02 0:27 BUG during shutdown - bisected to commit e2912009 Marc Dionne
2010-01-02 0:42 ` Peter Zijlstra [this message]
2010-01-04 18:43 ` Marc Dionne
2010-01-05 2:56 ` Xiaotian Feng
2010-01-05 3:23 ` Marc Dionne
2010-01-05 10:18 ` Xiaotian Feng
2010-01-05 22:58 ` Marc Dionne
2010-01-06 9:42 ` Xiaotian Feng
2010-01-07 0:44 ` Marc Dionne
2010-01-07 2:51 ` Xiaotian Feng
2010-01-07 3:07 ` Marc Dionne
2010-01-07 3:20 ` Marc Dionne
2010-01-07 3:24 ` Xiaotian Feng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1262392920.32223.10.camel@laptop \
--to=peterz@infradead.org \
--cc=dfeng@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=marc.c.dionne@gmail.com \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox