* [PATCH] softlockup detection vs. cpu hotplug
@ 2006-02-24 0:31 Nathan Lynch
2006-02-24 1:31 ` Shaohua Li
2006-02-24 6:25 ` Ingo Molnar
0 siblings, 2 replies; 3+ messages in thread
From: Nathan Lynch @ 2006-02-24 0:31 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton
I'm able to trigger bogus softlockup warnings during cpu hotplug
testing, due to the percpu timestamp not being re-initialized before
the cpu starts servicing timer interrupts.
Before starting a cpu's watchdog thread, touch the cpu's timestamp to
avoid false positives.
In the watchdog thread, do touch_softlockup_watchdog in a
non-preemptible section so that it won't touch another cpu's
timestamp. This can happen in the window between the watchdog thread
getting forcefully migrated during a cpu offline operation and
kthread_should_stop.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
---
kernel/softlockup.c | 20 ++++++++++++++++----
1 files changed, 16 insertions(+), 4 deletions(-)
33426e0c29ad973f9107cbd872648050f8988e61
diff --git a/kernel/softlockup.c b/kernel/softlockup.c
index c67189a..bd86fe1 100644
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -34,9 +34,14 @@ static struct notifier_block panic_block
.notifier_call = softlock_panic,
};
+static void __touch_softlockup_watchdog(int cpu)
+{
+ per_cpu(timestamp, cpu) = jiffies;
+}
+
void touch_softlockup_watchdog(void)
{
- per_cpu(timestamp, raw_smp_processor_id()) = jiffies;
+ __touch_softlockup_watchdog(raw_smp_processor_id());
}
EXPORT_SYMBOL(touch_softlockup_watchdog);
@@ -73,6 +78,7 @@ void softlockup_tick(struct pt_regs *reg
static int watchdog(void * __bind_cpu)
{
struct sched_param param = { .sched_priority = 99 };
+ unsigned long bind_cpu = (unsigned long)__bind_cpu;
sched_setscheduler(current, SCHED_FIFO, ¶m);
current->flags |= PF_NOFREEZE;
@@ -86,10 +92,15 @@ static int watchdog(void * __bind_cpu)
*/
while (!kthread_should_stop()) {
msleep_interruptible(1000);
- touch_softlockup_watchdog();
+ /* When our cpu is offlined the watchdog thread can
+ * get migrated before it is stopped.
+ */
+ preempt_disable();
+ if (likely(smp_processor_id() == bind_cpu))
+ touch_softlockup_watchdog();
+ preempt_enable();
+ __set_current_state(TASK_RUNNING);
}
- __set_current_state(TASK_RUNNING);
-
return 0;
}
@@ -112,6 +123,7 @@ cpu_callback(struct notifier_block *nfb,
}
per_cpu(watchdog_task, hotcpu) = p;
kthread_bind(p, hotcpu);
+ __touch_softlockup_watchdog(hotcpu);
break;
case CPU_ONLINE:
--
1.1.5
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] softlockup detection vs. cpu hotplug
2006-02-24 0:31 [PATCH] softlockup detection vs. cpu hotplug Nathan Lynch
@ 2006-02-24 1:31 ` Shaohua Li
2006-02-24 6:25 ` Ingo Molnar
1 sibling, 0 replies; 3+ messages in thread
From: Shaohua Li @ 2006-02-24 1:31 UTC (permalink / raw)
To: Nathan Lynch; +Cc: linux-kernel, Ingo Molnar, Andrew Morton
On Fri, 2006-02-24 at 08:31 +0800, Nathan Lynch wrote:
>
> In the watchdog thread, do touch_softlockup_watchdog in a
> non-preemptible section so that it won't touch another cpu's
> timestamp. This can happen in the window between the watchdog thread
> getting forcefully migrated during a cpu offline operation and
> kthread_should_stop.
Could we stop the thread in CPU_DOWN_PREPARE case, so it will not be
migrated to other CPUs? I suppose it's better the per-cpu thread only
runs on the specific cpu.
Thanks,
Shaohua
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] softlockup detection vs. cpu hotplug
2006-02-24 0:31 [PATCH] softlockup detection vs. cpu hotplug Nathan Lynch
2006-02-24 1:31 ` Shaohua Li
@ 2006-02-24 6:25 ` Ingo Molnar
1 sibling, 0 replies; 3+ messages in thread
From: Ingo Molnar @ 2006-02-24 6:25 UTC (permalink / raw)
To: Nathan Lynch; +Cc: linux-kernel, Andrew Morton
* Nathan Lynch <ntl@pobox.com> wrote:
> @@ -86,10 +92,15 @@ static int watchdog(void * __bind_cpu)
> */
> while (!kthread_should_stop()) {
> msleep_interruptible(1000);
> - touch_softlockup_watchdog();
> + /* When our cpu is offlined the watchdog thread can
> + * get migrated before it is stopped.
> + */
> + preempt_disable();
> + if (likely(smp_processor_id() == bind_cpu))
> + touch_softlockup_watchdog();
> + preempt_enable();
> + __set_current_state(TASK_RUNNING);
> }
> - __set_current_state(TASK_RUNNING);
> -
> return 0;
> }
>
the above change is unnecessary: there is absolutely no harm from a
migrated watchdog thread touching another CPU's timestamp for a short
amount of time. [Furtermore, why doesnt the hotplug CPU code use the
kthread_should_stop() mechanism to gracefully stop per-CPU threads,
instead of migrating unsuspecting threads and putting ugly hooks into
every such thread?]
the other fix (to touch before bringing the CPU up) looks OK, but it's
simpler to initialize it via the oneliner below. Andrew, this is what
i'd suggest to do for v2.6.16. [i'll send an updated softirq-rework
patch in the next mail.]
Ingo
------
fix from Nathan Lynch: initialize the softlockup timestamp of freshly
brought up CPUs with jiffies.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
--- kernel/softlockup.c.orig
+++ kernel/softlockup.c
@@ -110,6 +110,7 @@ cpu_callback(struct notifier_block *nfb,
printk("watchdog for %i failed\n", hotcpu);
return NOTIFY_BAD;
}
+ per_cpu(timestamp, hotcpu) = jiffies;
per_cpu(watchdog_task, hotcpu) = p;
kthread_bind(p, hotcpu);
break;
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2006-02-24 6:27 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-02-24 0:31 [PATCH] softlockup detection vs. cpu hotplug Nathan Lynch
2006-02-24 1:31 ` Shaohua Li
2006-02-24 6:25 ` Ingo Molnar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox