From: Peter Zijlstra <peterz@infradead.org>
To: Marc Dionne <marc.c.dionne@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Xiaotian Feng <dfeng@redhat.com>, Ingo Molnar <mingo@elte.hu>
Subject: Re: BUG during shutdown - bisected to commit e2912009
Date: Sat, 02 Jan 2010 01:42:00 +0100
Message-ID: <1262392920.32223.10.camel@laptop>
In-Reply-To: <6041d2001001011627o5c494df4v37c0c466df3d444c@mail.gmail.com>

On Fri, 2010-01-01 at 19:27 -0500, Marc Dionne wrote:
> I'm getting a BUG with current kernels from
> kernel/time/clockevents.c:263 when halting the system - a restart
> behaves normally.  I don't have a good camera handy at the moment to
> capture the call stack on screen, but the call sequence is:
> 
> clockevents_notify
> hrtimer_cpu_notify
> notifier_call_chain
> raw_notifier_call_chain
> _cpu_down
> disable_nonboot_cpus
> kernel_power_off
> sys_reboot
> 
> I bisected it down to commit e2912009: sched: Ensure set_task_cpu() is
> never called on blocked tasks.  There were a few commits tested along
> the way where I got a freeze (with the power still on) instead of a
> BUG. Reverting that commit from the current kernel doesn't look
> trivial, but the commit immediately preceding this one does halt fine.

We somehow seem to trip over the patch below, which doesn't really make
sense: I can't see how task placement would affect the check it adds.

That check seems to test purely against the hot-unplugged CPU, not
against a CPU the task is running on.
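
To make that condition concrete, here is a minimal userspace sketch of
what the new check in the patch below appears to test. It is not kernel
code: the struct, the enum values, the 32-bit mask, and the helper names
(mask_weight, bound_only_to, would_bug) are all hypothetical stand-ins
for clock_event_device, its mode, cpumask_weight() and
cpumask_test_cpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum evt_mode {
	EVT_MODE_UNUSED,
	EVT_MODE_SHUTDOWN,
	EVT_MODE_PERIODIC,
	EVT_MODE_ONESHOT,
};

struct fake_clock_event_device {
	uint32_t cpumask;	/* bit n set: device can serve CPU n */
	enum evt_mode mode;
};

/* Count the set bits in mask; models cpumask_weight(). */
static int mask_weight(uint32_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* True if dev is a per-cpu device whose mask contains only this CPU. */
static bool bound_only_to(const struct fake_clock_event_device *dev, int cpu)
{
	assert(cpu >= 0 && cpu < 32);
	return (dev->cpumask & (UINT32_C(1) << cpu)) &&
	       mask_weight(dev->cpumask) == 1;
}

/* True if the BUG_ON() condition would hold for dev once dead_cpu is gone. */
static bool would_bug(const struct fake_clock_event_device *dev, int dead_cpu)
{
	return bound_only_to(dev, dead_cpu) && dev->mode != EVT_MODE_UNUSED;
}
```

In this model only two inputs matter: the number of the CPU that went
away and the state of the devices bound exclusively to it. A device
shared between CPUs, or one bound to a different CPU, never matches.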

---
commit bb6eddf7676e1c1f3e637aa93c5224488d99036f
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu Dec 10 15:35:10 2009 +0100

    clockevents: Prevent clockevent_devices list corruption on cpu hotplug
    
    Xiaotian Feng triggered a list corruption in the clock events list on
    CPU hotplug and debugged the root cause.
    
    If a CPU registers more than one per cpu clock event device, then only
    the active clock event device is removed on CPU_DEAD. The unused
    devices are kept in the clock events device list.
    
    On CPU up the clock event devices are registered again, which means
    that we list_add an already enqueued list_head. That results in list
    corruption.
    
    Resolve this by removing all devices which are associated to the dead
    CPU on CPU_DEAD.
    
    Reported-by: Xiaotian Feng <dfeng@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Xiaotian Feng <dfeng@redhat.com>
    Cc: stable@kernel.org

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 20a8920..91db2e3 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -238,8 +238,9 @@ void clockevents_exchange_device(struct clock_event_device *old,
  */
 void clockevents_notify(unsigned long reason, void *arg)
 {
-	struct list_head *node, *tmp;
+	struct clock_event_device *dev, *tmp;
 	unsigned long flags;
+	int cpu;
 
 	spin_lock_irqsave(&clockevents_lock, flags);
 	clockevents_do_notify(reason, arg);
@@ -250,8 +251,19 @@ void clockevents_notify(unsigned long reason, void *arg)
 		 * Unregister the clock event devices which were
 		 * released from the users in the notify chain.
 		 */
-		list_for_each_safe(node, tmp, &clockevents_released)
-			list_del(node);
+		list_for_each_entry_safe(dev, tmp, &clockevents_released, list)
+			list_del(&dev->list);
+		/*
+		 * Now check whether the CPU has left unused per cpu devices
+		 */
+		cpu = *((int *)arg);
+		list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
+			if (cpumask_test_cpu(cpu, dev->cpumask) &&
+			    cpumask_weight(dev->cpumask) == 1) {
+				BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
+				list_del(&dev->list);
+			}
+		}
 		break;
 	default:
 		break;
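
The corruption described in the changelog above (a list_add of a
list_head that is already enqueued) can be reproduced with a small
userspace sketch. The list helpers here are simplified re-implementations
modelled on the kernel's circular doubly linked list, not the kernel
code itself, and list_count is a hypothetical helper added only to
detect a broken list.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Make head an empty circular list. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head, without checking whether it is already linked. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	struct list_head *next = head->next;

	next->prev = entry;
	entry->next = next;
	entry->prev = head;
	head->next = entry;
}

/* Unlink entry from whatever list it is on. */
static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = NULL;
	entry->prev = NULL;
}

/*
 * Walk forward from head. Returns the number of entries, or -1 if the
 * walk takes more than limit steps without getting back to head, which
 * means the forward links no longer form a ring through head.
 */
static int list_count(const struct list_head *head, int limit)
{
	int n = 0;

	for (const struct list_head *p = head->next; p != head; p = p->next)
		if (++n > limit)
			return -1;
	return n;
}
```

With two entries a and b on the list, calling list_add(&a, &head) a
second time leaves a and b pointing at each other, so a forward walk
never returns to head and list_count reports -1. Calling list_del(&a)
before the second list_add keeps the list intact, which corresponds to
removing the stale devices on CPU_DEAD as the patch does.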


