* [patch 0/4] Revised softlockup watchdog improvement patches
@ 2007-03-27 21:49 Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 1/4] Ignore stolen time in the softlockup watchdog Jeremy Fitzhardinge
` (3 more replies)
0 siblings, 4 replies; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-03-27 21:49 UTC (permalink / raw)
To: Ingo Molnar
Cc: Prarit Bhargava, virtualization, Andrew Morton, Linux Kernel,
Eric Dumazet
Hi Ingo,
This series of patches implements a number of improvements to the
softlockup watchdog and its users.
They are:
1. Make the watchdog ignore stolen time
When running under a hypervisor, the kernel may lose an arbitrary amount
of time as "stolen time". This may cause the softlockup watchdog to
trigger spruiously. Xen and VMI implement sched_clock() as measuring
unstolen time, so use that as the timebase for the softlockup watchdog.
This also removes a dependency on jiffies.
2. Add a per-cpu enable flag for the watchdog
When the scheduler disables ticks for a specific CPU, this allows the
watchdog to be disabled as well. This avoids spurious watchdog errors
if a CPU has been tickless for a long time. The existing tick-sched.c
code tried to address this by periodically touching the watchdog, but
this seems more direct.
3. Use the per-cpu enable flags to temporarily disable the watchdog
Some drivers perform long operations which will cause the watchdog to
time out. If we know we're about to start such an operation, disable
the watchdog timer for the interim. This is more straightforward than
trying to periodically tickle the watchdog timer during the operation, and
safer than just doing it once at the start and hoping there's enough time.
4. Add a global disable flag
Sometimes a long operation will affect all CPUs' ability to tickle the
watchdog timer. An obvious example is suspend/resume, but it can also
happen while generating lots of sysrq output, or other specialized
operations. Add global enable/disable calls to deal with these case.
--
^ permalink raw reply [flat|nested] 33+ messages in thread
* [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-03-27 21:49 [patch 0/4] Revised softlockup watchdog improvement patches Jeremy Fitzhardinge
@ 2007-03-27 21:49 ` Jeremy Fitzhardinge
2007-04-24 6:49 ` Andrew Morton
2007-03-27 21:49 ` [patch 2/4] percpu enable flag for " Jeremy Fitzhardinge
` (2 subsequent siblings)
3 siblings, 1 reply; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-03-27 21:49 UTC (permalink / raw)
To: Ingo Molnar
Cc: Prarit Bhargava, Rick Lindsley, john stultz, Eric Dumazet,
virtualization, Paul Mackerras, Martin Schwidefsky, Andrew Morton,
Thomas Gleixner, Linux Kernel
[-- Attachment #1: softlockup-ignore-stolen-time.patch --]
[-- Type: text/plain, Size: 4125 bytes --]
The softlockup watchdog is currently a nuisance in a virtual machine,
since the whole system could have the CPU stolen from it for a long
period of time. While it would be unlikely for a guest domain to be
denied timer interrupts for over 10s, it could happen and any softlockup
message would be completely spurious.
Earlier I proposed that sched_clock() return time in unstolen
nanoseconds, which is how Xen and VMI currently implement it. If the
softlockup watchdog uses sched_clock() to measure time, it would
automatically ignore stolen time, and therefore only report when the
guest itself locked up. When running native, sched_clock() returns
real-time nanoseconds, so the behaviour would be unchanged.
Note that sched_clock() used this way is inherently per-cpu, so this
patch makes sure that the per-processor watchdog thread initialized
its own timestamp.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: James Morris <jmorris@namei.org>
Cc: Dan Hecht <dhecht@vmware.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Chris Lalancette <clalance@redhat.com>
Cc: Rick Lindsley <ricklind@us.ibm.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
---
kernel/softlockup.c | 37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
===================================================================
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -35,9 +35,19 @@ static struct notifier_block panic_block
.notifier_call = softlock_panic,
};
+/*
+ * Returns seconds, approximately. We don't need nanosecond
+ * resolution, and we don't need to waste time with a big divide when
+ * 2^30ns == 1.074s.
+ */
+static unsigned long get_timestamp(void)
+{
+ return sched_clock() >> 30; /* 2^30 ~= 10^9 */
+}
+
void touch_softlockup_watchdog(void)
{
- __raw_get_cpu_var(touch_timestamp) = jiffies;
+ __raw_get_cpu_var(touch_timestamp) = get_timestamp();
}
EXPORT_SYMBOL(touch_softlockup_watchdog);
@@ -48,10 +58,18 @@ void softlockup_tick(void)
void softlockup_tick(void)
{
int this_cpu = smp_processor_id();
- unsigned long touch_timestamp = per_cpu(touch_timestamp, this_cpu);
+ unsigned long touch_timestamp = __get_cpu_var(touch_timestamp);
+ unsigned long print_timestamp;
+ unsigned long now;
- /* prevent double reports: */
- if (per_cpu(print_timestamp, this_cpu) == touch_timestamp ||
+ /* watchdog task hasn't updated timestamp yet */
+ if (touch_timestamp == 0)
+ return;
+
+ print_timestamp = __get_cpu_var(print_timestamp);
+
+ /* report at most once a second */
+ if (print_timestamp < (touch_timestamp + 1) ||
did_panic ||
!per_cpu(watchdog_task, this_cpu))
return;
@@ -62,12 +80,14 @@ void softlockup_tick(void)
return;
}
+ now = get_timestamp();
+
/* Wake up the high-prio watchdog task every second: */
- if (time_after(jiffies, touch_timestamp + HZ))
+ if (now > (touch_timestamp + 1))
wake_up_process(per_cpu(watchdog_task, this_cpu));
/* Warn about unreasonable 10+ seconds delays: */
- if (time_after(jiffies, touch_timestamp + 10*HZ)) {
+ if (now > (touch_timestamp + 10)) {
per_cpu(print_timestamp, this_cpu) = touch_timestamp;
spin_lock(&print_lock);
@@ -87,6 +107,9 @@ static int watchdog(void * __bind_cpu)
sched_setscheduler(current, SCHED_FIFO, ¶m);
current->flags |= PF_NOFREEZE;
+
+ /* initialize timestamp */
+ touch_softlockup_watchdog();
/*
* Run briefly once per second to reset the softlockup timestamp.
@@ -120,7 +143,7 @@ cpu_callback(struct notifier_block *nfb,
printk("watchdog for %i failed\n", hotcpu);
return NOTIFY_BAD;
}
- per_cpu(touch_timestamp, hotcpu) = jiffies;
+ per_cpu(touch_timestamp, hotcpu) = 0;
per_cpu(watchdog_task, hotcpu) = p;
kthread_bind(p, hotcpu);
break;
--
^ permalink raw reply [flat|nested] 33+ messages in thread
* [patch 2/4] percpu enable flag for softlockup watchdog
2007-03-27 21:49 [patch 0/4] Revised softlockup watchdog improvement patches Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 1/4] Ignore stolen time in the softlockup watchdog Jeremy Fitzhardinge
@ 2007-03-27 21:49 ` Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 3/4] Locally disable the softlockup watchdog rather than touching it Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 4/4] Add global disable/enable for softlockup watchdog Jeremy Fitzhardinge
3 siblings, 0 replies; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-03-27 21:49 UTC (permalink / raw)
To: Ingo Molnar
Cc: Prarit Bhargava, john stultz, Eric Dumazet, virtualization,
Paul Mackerras, Martin Schwidefsky, Andrew Morton,
Thomas Gleixner, Linux Kernel
[-- Attachment #1: softlockup-percpu-enable-flag.patch --]
[-- Type: text/plain, Size: 8065 bytes --]
On a NO_HZ system, there may be an arbitrarily long delay between
ticks on a CPU. When we're disabling ticks for a CPU, also disable
the softlockup watchdog timer.
This makes the touch_softlockup_watchdog() interface redundant; if a
piece of code knows its going to be holding off timer interrupts long
enough to trigger the watchdog, then it may as well simply temporarily
disable the timer.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: James Morris <jmorris@namei.org>
Cc: Dan Hecht <dhecht@vmware.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Chris Lalancette <clalance@redhat.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
---
include/linux/sched.h | 25 +++++++++++++++++
kernel/softlockup.c | 67 ++++++++++++++++++++++++++++++++++++++++++----
kernel/time/tick-sched.c | 34 ++++++++++-------------
3 files changed, 102 insertions(+), 24 deletions(-)
===================================================================
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -232,12 +232,37 @@ extern void scheduler_tick(void);
extern void scheduler_tick(void);
#ifdef CONFIG_DETECT_SOFTLOCKUP
+/* A scheduler tick on this CPU */
extern void softlockup_tick(void);
+
+/* Scheduler is disabling/re-enabling ticks on this CPU */
+extern void softlockup_tick_disable(void);
+extern void softlockup_tick_enable(void);
+
+/* Some code wants to temporarily disable the watchdog */
+extern int softlockup_disable(void);
+extern void softlockup_enable(int state);
+
extern void spawn_softlockup_task(void);
extern void touch_softlockup_watchdog(void);
#else
static inline void softlockup_tick(void)
{
+}
+static inline void softlockup_tick_disable(void)
+{
+}
+static inline void softlockup_tick_enable(void)
+{
+}
+static inline int softlockup_disable(void)
+{
+ preempt_disable();
+ return 0;
+}
+static inline void softlockup_enable(int state)
+{
+ preempt_enable();
}
static inline void spawn_softlockup_task(void)
{
===================================================================
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -20,6 +20,7 @@ static DEFINE_PER_CPU(unsigned long, tou
static DEFINE_PER_CPU(unsigned long, touch_timestamp);
static DEFINE_PER_CPU(unsigned long, print_timestamp);
static DEFINE_PER_CPU(struct task_struct *, watchdog_task);
+static DEFINE_PER_CPU(int, enabled);
static int did_panic = 0;
@@ -45,11 +46,65 @@ static unsigned long get_timestamp(void)
return sched_clock() >> 30; /* 2^30 ~= 10^9 */
}
-void touch_softlockup_watchdog(void)
+void inline touch_softlockup_watchdog(void)
{
__raw_get_cpu_var(touch_timestamp) = get_timestamp();
}
EXPORT_SYMBOL(touch_softlockup_watchdog);
+
+/*
+ * Disable the watchdog on this CPU. This is called directly by the
+ * scheduler to tell us it's going tickless.
+ */
+void inline softlockup_tick_disable(void)
+{
+ __get_cpu_var(enabled) = 0;
+}
+
+/*
+ * Disable the watchdog for this CPU, returning the current state to
+ * allow nesting. Returns with preemptio disabled, since we can't
+ * switch CPUs before we re-enable the watchdog (also, if we're
+ * worried about getting watchdog timeouts, we're not scheduling).
+ */
+int softlockup_disable(void)
+{
+ int ret;
+
+ preempt_disable();
+
+ ret = __get_cpu_var(enabled);
+ softlockup_tick_disable();
+
+ return ret;
+}
+EXPORT_SYMBOL(softlockup_disable);
+
+/*
+ * Re-enable the watchdog on this CPU. Called directly by the
+ * scheduler to tell us ticks are resuming.
+ */
+void inline softlockup_tick_enable(void)
+{
+ __get_cpu_var(enabled) = 1;
+}
+
+/*
+ * Returns softlockup watchdog state to before the last per-cpu
+ * disable.
+ */
+void softlockup_enable(int state)
+{
+ if (state) {
+ touch_softlockup_watchdog();
+ /* update timestamp before enable */
+ barrier();
+ softlockup_tick_enable();
+ }
+
+ preempt_enable();
+}
+EXPORT_SYMBOL(softlockup_enable);
/*
* This callback runs from the timer interrupt, and checks
@@ -62,8 +117,8 @@ void softlockup_tick(void)
unsigned long print_timestamp;
unsigned long now;
- /* watchdog task hasn't updated timestamp yet */
- if (touch_timestamp == 0)
+ /* return if not enabled */
+ if (!__get_cpu_var(enabled))
return;
print_timestamp = __get_cpu_var(print_timestamp);
@@ -108,8 +163,8 @@ static int watchdog(void * __bind_cpu)
sched_setscheduler(current, SCHED_FIFO, ¶m);
current->flags |= PF_NOFREEZE;
- /* initialize timestamp */
- touch_softlockup_watchdog();
+ /* enable on this cpu */
+ softlockup_tick_enable();
/*
* Run briefly once per second to reset the softlockup timestamp.
@@ -122,6 +177,8 @@ static int watchdog(void * __bind_cpu)
touch_softlockup_watchdog();
schedule();
}
+
+ softlockup_tick_disable();
return 0;
}
===================================================================
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -228,6 +228,8 @@ void tick_nohz_stop_sched_tick(void)
ts->idle_tick = ts->sched_timer.expires;
ts->tick_stopped = 1;
ts->idle_jiffies = last_jiffies;
+
+ softlockup_tick_disable();
}
/*
* calculate the expiry time for the next timer wheel
@@ -255,6 +257,7 @@ void tick_nohz_stop_sched_tick(void)
cpu_clear(cpu, nohz_cpu_mask);
}
raise_softirq_irqoff(TIMER_SOFTIRQ);
+
out:
ts->next_jiffies = next_jiffies;
ts->last_jiffies = last_jiffies;
@@ -311,6 +314,8 @@ void tick_nohz_restart_sched_tick(void)
ts->tick_stopped = 0;
hrtimer_cancel(&ts->sched_timer);
ts->sched_timer.expires = ts->idle_tick;
+
+ softlockup_tick_enable();
while (1) {
/* Forward the time to expire in the future */
@@ -355,17 +360,12 @@ static void tick_nohz_handler(struct clo
tick_do_update_jiffies64(now);
/*
- * When we are idle and the tick is stopped, we have to touch
- * the watchdog as we might not schedule for a really long
- * time. This happens on complete idle SMP systems while
- * waiting on the login prompt. We also increment the "start
- * of idle" jiffy stamp so the idle accounting adjustment we
- * do when we go busy again does not account too much ticks.
- */
- if (ts->tick_stopped) {
- touch_softlockup_watchdog();
+ * Increment the "start of idle" jiffy stamp so the idle
+ * accounting adjustment we do when we go busy again does not
+ * account too much ticks.
+ */
+ if (ts->tick_stopped)
ts->idle_jiffies++;
- }
update_process_times(user_mode(regs));
profile_tick(CPU_PROFILING);
@@ -450,17 +450,12 @@ static enum hrtimer_restart tick_sched_t
*/
if (regs) {
/*
- * When we are idle and the tick is stopped, we have to touch
- * the watchdog as we might not schedule for a really long
- * time. This happens on complete idle SMP systems while
- * waiting on the login prompt. We also increment the "start of
- * idle" jiffy stamp so the idle accounting adjustment we do
- * when we go busy again does not account too much ticks.
+ * Increment the "start of idle" jiffy stamp so the
+ * idle accounting adjustment we do when we go busy
+ * again does not account too much ticks.
*/
- if (ts->tick_stopped) {
- touch_softlockup_watchdog();
+ if (ts->tick_stopped)
ts->idle_jiffies++;
- }
/*
* update_process_times() might take tasklist_lock, hence
* drop the base lock. sched-tick hrtimers are per-CPU and
@@ -522,6 +517,7 @@ void tick_cancel_sched_timer(int cpu)
if (ts->sched_timer.base)
hrtimer_cancel(&ts->sched_timer);
ts->tick_stopped = 0;
+ softlockup_tick_enable();
ts->nohz_mode = NOHZ_MODE_INACTIVE;
}
#endif /* HIGH_RES_TIMERS */
--
^ permalink raw reply [flat|nested] 33+ messages in thread
* [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-27 21:49 [patch 0/4] Revised softlockup watchdog improvement patches Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 1/4] Ignore stolen time in the softlockup watchdog Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 2/4] percpu enable flag for " Jeremy Fitzhardinge
@ 2007-03-27 21:49 ` Jeremy Fitzhardinge
2007-03-28 13:33 ` Prarit Bhargava
2007-03-27 21:49 ` [patch 4/4] Add global disable/enable for softlockup watchdog Jeremy Fitzhardinge
3 siblings, 1 reply; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-03-27 21:49 UTC (permalink / raw)
To: Ingo Molnar
Cc: Prarit Bhargava, virtualization, Andrew Morton, John Hawkes,
Linux Kernel, Eric Dumazet
[-- Attachment #1: softlockup-disable-rather-than-touch.patch --]
[-- Type: text/plain, Size: 4996 bytes --]
If we're about to enter a prolonged period in which we know we're
going to hold off scheduler ticks, then disable the CPU's softlockup
watchdog for the duration. This avoids having to repeatedly touch the
timestamp, or conversely, risk having the watchdog timeout in the
middle of your long operation.
A question about each of these changes is whether they expect just the
current CPU to be locked up, or the whole machine.
touch_softlockup_watchdog updates the per-cpu timestamp, so I assumed
that its per-cpu for all these cases.
One semantic change this makes is that softlockup_disable/enable does
a preempt_disable/enable. I don't think this is a problem, since if
you're worried about triggering the softlockup watchdog, you're not
scheduling.
I haven't really worked out how this should interact with the nmi
watchdog; touch_nmi_watchdog() still ends up calling
touch_softlockup_watchdog(), so there's still some redundancy here.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Chris Lalancette <clalance@redhat.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: John Hawkes <hawkes@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
arch/ia64/kernel/uncached.c | 8 ++++++--
drivers/ide/ide-iops.c | 25 +++++++++++++++++++------
drivers/ide/ide-taskfile.c | 8 +++++++-
drivers/mtd/nand/nand_base.c | 8 +++++++-
4 files changed, 39 insertions(+), 10 deletions(-)
===================================================================
--- a/arch/ia64/kernel/uncached.c
+++ b/arch/ia64/kernel/uncached.c
@@ -253,13 +253,17 @@ static int __init uncached_build_memmap(
int nid = paddr_to_nid(uc_start - __IA64_UNCACHED_OFFSET);
struct gen_pool *pool = uncached_pools[nid].pool;
size_t size = uc_end - uc_start;
-
- touch_softlockup_watchdog();
+ int sl_state;
+
+ sl_state = softlockup_disable();
if (pool != NULL) {
memset((char *)uc_start, 0, size);
(void) gen_pool_add(pool, uc_start, size, nid);
}
+
+ softlockup_enable(sl_state);
+
return 0;
}
===================================================================
--- a/drivers/ide/ide-iops.c
+++ b/drivers/ide/ide-iops.c
@@ -1215,6 +1215,12 @@ int ide_wait_not_busy(ide_hwif_t *hwif,
int ide_wait_not_busy(ide_hwif_t *hwif, unsigned long timeout)
{
u8 stat = 0;
+ int ret = -EBUSY;
+ int sl_state;
+
+ /* Tell softlockup this might take a while.
+ XXX should this be global or per-cpu? */
+ sl_state = softlockup_disable();
while(timeout--) {
/*
@@ -1223,19 +1229,26 @@ int ide_wait_not_busy(ide_hwif_t *hwif,
*/
mdelay(1);
stat = hwif->INB(hwif->io_ports[IDE_STATUS_OFFSET]);
- if ((stat & BUSY_STAT) == 0)
- return 0;
+ if ((stat & BUSY_STAT) == 0) {
+ ret = 0;
+ break;
+ }
+
/*
* Assume a value of 0xff means nothing is connected to
* the interface and it doesn't implement the pull-down
* resistor on D7.
*/
- if (stat == 0xff)
- return -ENODEV;
- touch_softlockup_watchdog();
+ if (stat == 0xff) {
+ ret = -ENODEV;
+ break;
+ }
touch_nmi_watchdog();
}
- return -EBUSY;
+
+ softlockup_enable(sl_state);
+
+ return ret;
}
EXPORT_SYMBOL_GPL(ide_wait_not_busy);
===================================================================
--- a/drivers/ide/ide-taskfile.c
+++ b/drivers/ide/ide-taskfile.c
@@ -310,10 +310,14 @@ static void ide_pio_datablock(ide_drive_
static void ide_pio_datablock(ide_drive_t *drive, struct request *rq,
unsigned int write)
{
+ int sl_state;
+
if (rq->bio) /* fs request */
rq->errors = 0;
- touch_softlockup_watchdog();
+ /* Tell softlockup this might take a while.
+ XXX should this be global or per-cpu? */
+ sl_state = softlockup_disable();
switch (drive->hwif->data_phase) {
case TASKFILE_MULTI_IN:
@@ -324,6 +328,8 @@ static void ide_pio_datablock(ide_drive_
ide_pio_sector(drive, write);
break;
}
+
+ softlockup_enable(sl_state);
}
static ide_startstop_t task_error(ide_drive_t *drive, struct request *rq,
===================================================================
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -419,15 +419,21 @@ void nand_wait_ready(struct mtd_info *mt
{
struct nand_chip *chip = mtd->priv;
unsigned long timeo = jiffies + 2;
+ int sl_state;
+
+ /* Tell softlockup this might take a while.
+ XXX should this be global or per-cpu? */
+ sl_state = softlockup_disable();
led_trigger_event(nand_led_trigger, LED_FULL);
/* wait until command is processed or timeout occures */
do {
if (chip->dev_ready(mtd))
break;
- touch_softlockup_watchdog();
} while (time_before(jiffies, timeo));
led_trigger_event(nand_led_trigger, LED_OFF);
+
+ softlockup_enable(sl_state);
}
EXPORT_SYMBOL_GPL(nand_wait_ready);
--
^ permalink raw reply [flat|nested] 33+ messages in thread
* [patch 4/4] Add global disable/enable for softlockup watchdog
2007-03-27 21:49 [patch 0/4] Revised softlockup watchdog improvement patches Jeremy Fitzhardinge
` (2 preceding siblings ...)
2007-03-27 21:49 ` [patch 3/4] Locally disable the softlockup watchdog rather than touching it Jeremy Fitzhardinge
@ 2007-03-27 21:49 ` Jeremy Fitzhardinge
3 siblings, 0 replies; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-03-27 21:49 UTC (permalink / raw)
To: Ingo Molnar
Cc: Prarit Bhargava, virtualization, Andrew Morton, Linux Kernel,
Eric Dumazet
[-- Attachment #1: softlockup-global-enable.patch --]
[-- Type: text/plain, Size: 7487 bytes --]
Some machine-wide activities can cause spurious softlockup watchdog
warnings, so add a mechanism to allow the watchdog to be disabled.
The most obvious activity is suspend/resume, but long sysrq output can
also stall the system long enough to cause problems.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Chris Lalancette <clalance@redhat.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
---
drivers/char/sysrq.c | 8 +++++
include/linux/sched.h | 10 ++++++
kernel/panic.c | 3 +-
kernel/power/swsusp.c | 3 +-
kernel/softlockup.c | 72 ++++++++++++++++++++++++++++++++++++++++++-------
kernel/timer.c | 4 ++
6 files changed, 87 insertions(+), 13 deletions(-)
===================================================================
--- a/drivers/char/sysrq.c
+++ b/drivers/char/sysrq.c
@@ -211,7 +211,11 @@ static struct sysrq_key_op sysrq_showreg
static void sysrq_handle_showstate(int key, struct tty_struct *tty)
{
+ softlockup_global_disable(); /* may take a while */
+
show_state();
+
+ softlockup_global_enable();
}
static struct sysrq_key_op sysrq_showstate_op = {
.handler = sysrq_handle_showstate,
@@ -222,7 +226,11 @@ static struct sysrq_key_op sysrq_showsta
static void sysrq_handle_showstate_blocked(int key, struct tty_struct *tty)
{
+ softlockup_global_disable(); /* may take a while */
+
show_state_filter(TASK_UNINTERRUPTIBLE);
+
+ softlockup_global_enable();
}
static struct sysrq_key_op sysrq_showstate_blocked_op = {
.handler = sysrq_handle_showstate_blocked,
===================================================================
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -243,6 +243,10 @@ extern int softlockup_disable(void);
extern int softlockup_disable(void);
extern void softlockup_enable(int state);
+/* Disable/re-enable softlockup watchdog on all CPUs */
+extern void softlockup_global_disable(void);
+extern void softlockup_global_enable(void);
+
extern void spawn_softlockup_task(void);
extern void touch_softlockup_watchdog(void);
#else
@@ -263,6 +267,12 @@ static inline void softlockup_enable(int
static inline void softlockup_enable(int state)
{
preempt_enable();
+}
+static inline void softlockup_global_enable(void)
+{
+}
+static inline void softlockup_global_disable(void)
+{
}
static inline void spawn_softlockup_task(void)
{
===================================================================
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -131,8 +131,9 @@ NORET_TYPE void panic(const char * fmt,
disabled_wait(caller);
#endif
local_irq_enable();
+ softlockup_global_disable();
+
for (i = 0;;) {
- touch_softlockup_watchdog();
i += panic_blink(i);
mdelay(1);
i++;
===================================================================
--- a/kernel/power/swsusp.c
+++ b/kernel/power/swsusp.c
@@ -289,6 +289,7 @@ int swsusp_suspend(void)
* that suspended with irqs off ... no overall powerup.
*/
device_power_up();
+ softlockup_global_disable();
Enable_irqs:
local_irq_enable();
return error;
@@ -323,7 +324,7 @@ int swsusp_resume(void)
*/
swsusp_free();
restore_processor_state();
- touch_softlockup_watchdog();
+ softlockup_global_enable();
device_power_up();
local_irq_enable();
return error;
===================================================================
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -17,10 +17,21 @@
static DEFINE_SPINLOCK(print_lock);
+/*
+ * Since sched_clock() is inherently per-cpu, its not possible to
+ * update another CPU's timestamp. To deal with this, we add an extra
+ * state meaning "enabled, but timestamp needs update".
+ */
+enum state {
+ SL_OFF = 0, /* disabled */
+ SL_UPDATE, /* enabled, but timestamp old */
+ SL_ON, /* enabled */
+};
+
static DEFINE_PER_CPU(unsigned long, touch_timestamp);
static DEFINE_PER_CPU(unsigned long, print_timestamp);
static DEFINE_PER_CPU(struct task_struct *, watchdog_task);
-static DEFINE_PER_CPU(int, enabled);
+static DEFINE_PER_CPU(enum state, softlock_state);
static int did_panic = 0;
@@ -48,7 +59,12 @@ static unsigned long get_timestamp(void)
void inline touch_softlockup_watchdog(void)
{
+ if (__raw_get_cpu_var(softlock_state) == SL_OFF)
+ return;
+
__raw_get_cpu_var(touch_timestamp) = get_timestamp();
+ barrier();
+ __raw_get_cpu_var(softlock_state) = SL_ON;
}
EXPORT_SYMBOL(touch_softlockup_watchdog);
@@ -58,7 +74,7 @@ EXPORT_SYMBOL(touch_softlockup_watchdog)
*/
void inline softlockup_tick_disable(void)
{
- __get_cpu_var(enabled) = 0;
+ __get_cpu_var(softlock_state) = SL_OFF;
}
/*
@@ -73,7 +89,7 @@ int softlockup_disable(void)
preempt_disable();
- ret = __get_cpu_var(enabled);
+ ret = __get_cpu_var(softlock_state) == SL_OFF;
softlockup_tick_disable();
return ret;
@@ -86,7 +102,7 @@ EXPORT_SYMBOL(softlockup_disable);
*/
void inline softlockup_tick_enable(void)
{
- __get_cpu_var(enabled) = 1;
+ __get_cpu_var(softlock_state) = SL_UPDATE;
}
/*
@@ -96,15 +112,41 @@ void softlockup_enable(int state)
void softlockup_enable(int state)
{
if (state) {
- touch_softlockup_watchdog();
- /* update timestamp before enable */
- barrier();
softlockup_tick_enable();
+ touch_softlockup_watchdog();
}
preempt_enable();
}
EXPORT_SYMBOL(softlockup_enable);
+
+/*
+ * Disable softlockup watchdog on all CPUs. This is useful for
+ * globally disruptive activities, like suspend/resume or large sysrq
+ * debug outputs.
+ */
+void softlockup_global_disable()
+{
+ unsigned cpu;
+
+ for_each_online_cpu(cpu)
+ per_cpu(softlock_state, cpu) = SL_OFF;
+}
+EXPORT_SYMBOL(softlockup_global_disable);
+
+/*
+ * Globally re-enable soft lockups. This will obviously interfere
+ * with any CPU's local softlockup disable, but with luck that won't
+ * matter.
+ */
+void softlockup_global_enable()
+{
+ unsigned cpu;
+
+ for_each_online_cpu(cpu)
+ per_cpu(softlock_state, cpu) = SL_UPDATE;
+}
+EXPORT_SYMBOL(softlockup_global_enable);
/*
* This callback runs from the timer interrupt, and checks
@@ -117,9 +159,19 @@ void softlockup_tick(void)
unsigned long print_timestamp;
unsigned long now;
- /* return if not enabled */
- if (!__get_cpu_var(enabled))
- return;
+ switch(__get_cpu_var(softlock_state)) {
+ case SL_OFF:
+ /* not enabled */
+ return;
+
+ case SL_UPDATE:
+ /* update timestamp */
+ touch_softlockup_watchdog();
+ return;
+
+ case SL_ON:
+ break;
+ }
print_timestamp = __get_cpu_var(print_timestamp);
===================================================================
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1011,7 +1011,7 @@ static int timekeeping_resume(struct sys
timekeeping_suspended = 0;
write_sequnlock_irqrestore(&xtime_lock, flags);
- touch_softlockup_watchdog();
+ softlockup_global_enable();
clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
@@ -1029,6 +1029,8 @@ static int timekeeping_suspend(struct sy
timekeeping_suspended = 1;
timekeeping_suspend_time = read_persistent_clock();
write_sequnlock_irqrestore(&xtime_lock, flags);
+
+ softlockup_global_disable();
clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
--
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-27 21:49 ` [patch 3/4] Locally disable the softlockup watchdog rather than touching it Jeremy Fitzhardinge
@ 2007-03-28 13:33 ` Prarit Bhargava
2007-03-28 13:50 ` Andi Kleen
2007-03-28 14:44 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 33+ messages in thread
From: Prarit Bhargava @ 2007-03-28 13:33 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: virtualization, Andrew Morton, Ingo Molnar, John Hawkes,
Linux Kernel, Eric Dumazet
Jeremy Fitzhardinge wrote:
>
> I haven't really worked out how this should interact with the nmi
> watchdog; touch_nmi_watchdog() still ends up calling
> touch_softlockup_watchdog(), so there's still some redundancy here.
>
>
touch_nmi_watchdog is attempting to tickle _all_ CPUs softlockup watchdogs.
Currently, the code is incorrect -- it is calling
touch_softlockup_watchdog which touches only the current CPU's
softlockup watchdog.
I don't like the idea of having touch_softlockup_watchdog exported with
your new code -- we still have two methods of effecting the softlockup
watchdog and that's confusing and its going to cause serious problems
down the road. The nmi watchdog code seems fine with just touching the
CPU's nmi watchdogs.
Is there a reason that you're pushing the enable/disable? All the cases
called out seem to be just fine with calls to either effect that CPU's
softlockup watchdog or doing all CPU's softlockup watchdogs. I'm not
sure I see the benefit of complicating the softlockup watchdog code with
this ...
I agree with the first patch of this set -- it makes sense. But beyond
that I'm not convinced the rest of the code is needed ... IMO.
P.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-28 13:33 ` Prarit Bhargava
@ 2007-03-28 13:50 ` Andi Kleen
2007-03-28 14:00 ` Prarit Bhargava
2007-03-28 14:44 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 33+ messages in thread
From: Andi Kleen @ 2007-03-28 13:50 UTC (permalink / raw)
To: virtualization
Cc: Prarit Bhargava, Jeremy Fitzhardinge, virtualization,
Andrew Morton, Ingo Molnar, John Hawkes, Linux Kernel,
Eric Dumazet
On Wednesday 28 March 2007 15:33, Prarit Bhargava wrote:
>
> Jeremy Fitzhardinge wrote:
> >
> > I haven't really worked out how this should interact with the nmi
> > watchdog; touch_nmi_watchdog() still ends up calling
> > touch_softlockup_watchdog(), so there's still some redundancy here.
> >
> >
>
> touch_nmi_watchdog is attempting to tickle _all_ CPUs softlockup watchdogs.
It is supposed to only touch the current CPU, just like it only touches
the NMI watchdog on the current CPU.
>
> Currently, the code is incorrect -- it is calling
> touch_softlockup_watchdog which touches only the current CPU's
> softlockup watchdog.
Sounds correct to me.
-Andi
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-28 13:50 ` Andi Kleen
@ 2007-03-28 14:00 ` Prarit Bhargava
2007-03-28 14:09 ` Andi Kleen
0 siblings, 1 reply; 33+ messages in thread
From: Prarit Bhargava @ 2007-03-28 14:00 UTC (permalink / raw)
To: Andi Kleen
Cc: virtualization, Jeremy Fitzhardinge, virtualization,
Andrew Morton, Ingo Molnar, John Hawkes, Linux Kernel,
Eric Dumazet
>> touch_nmi_watchdog is attempting to tickle _all_ CPUs softlockup watchdogs.
>>
>
> It is supposed to only touch the current CPU, just like it only touches
> the NMI watchdog on the current CPU.
>
>
Andi,
(sorry for the cut-and-paste).
touch_nmi_watchdogs sets EACH CPUs alert_counter to 0.
void touch_nmi_watchdog (void)
{
if (nmi_watchdog > 0) {
unsigned cpu;
/*
* Just reset the alert counters, (other CPUs might be
* spinning on locks we hold):
*/
for_each_present_cpu (cpu)
alert_counter[cpu] = 0;
}
/*
* Tickle the softlockup detector too:
*/
touch_softlockup_watchdog();
}
The call to touch_softlockup_watchdog here is incorrect -- it is only
touching the current CPU's softlockup.
P.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-28 14:00 ` Prarit Bhargava
@ 2007-03-28 14:09 ` Andi Kleen
2007-03-28 14:13 ` Prarit Bhargava
0 siblings, 1 reply; 33+ messages in thread
From: Andi Kleen @ 2007-03-28 14:09 UTC (permalink / raw)
To: Prarit Bhargava
Cc: virtualization, Jeremy Fitzhardinge, virtualization,
Andrew Morton, Ingo Molnar, John Hawkes, Linux Kernel,
Eric Dumazet
On Wednesday 28 March 2007 16:00, Prarit Bhargava wrote:
>
> >> touch_nmi_watchdog is attempting to tickle _all_ CPUs softlockup watchdogs.
> >>
> >
> > It is supposed to only touch the current CPU, just like it only touches
> > the NMI watchdog on the current CPU.
> >
> >
>
> Andi,
>
> (sorry for the cut-and-paste).
>
> touch_nmi_watchdogs sets EACH CPUs alert_counter to 0.
You're right. Sorry for the confusion.
But just touching the current CPU would make much more sense. After all
the caller doesn't know anything about the state of other CPUs. Perhaps it would be best
to just change that and keep the softlockup semantics.
-Andi
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-28 14:09 ` Andi Kleen
@ 2007-03-28 14:13 ` Prarit Bhargava
0 siblings, 0 replies; 33+ messages in thread
From: Prarit Bhargava @ 2007-03-28 14:13 UTC (permalink / raw)
To: Andi Kleen
Cc: virtualization, Jeremy Fitzhardinge, virtualization,
Andrew Morton, Ingo Molnar, John Hawkes, Linux Kernel,
Eric Dumazet
Andi Kleen wrote:
> On Wednesday 28 March 2007 16:00, Prarit Bhargava wrote:
>
>>>> touch_nmi_watchdog is attempting to tickle _all_ CPUs softlockup watchdogs.
>>>>
>>>>
>>> It is supposed to only touch the current CPU, just like it only touches
>>> the NMI watchdog on the current CPU.
>>>
>>>
>>>
>> Andi,
>>
>> (sorry for the cut-and-paste).
>>
>> touch_nmi_watchdogs sets EACH CPUs alert_counter to 0.
>>
>
> You're right. Sorry for the confusion.
>
> But just touching the current CPU would make much more sense. After all
> the caller doesn't know anything about the state of other CPUs. Perhaps it would be best
> to just change that and keep the softlockup semantics.
>
Yeah -- you're probably right, and besides that we're not seeing a crazy
# of softlockup messages after touch_nmi_watchdogs calls.
My original comments regarding the code still stand though -- we
shouldn't have multiple methods of playing with the softlockup watchdog.
P.
> -Andi
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-28 13:33 ` Prarit Bhargava
2007-03-28 13:50 ` Andi Kleen
@ 2007-03-28 14:44 ` Jeremy Fitzhardinge
2007-03-28 14:51 ` Prarit Bhargava
1 sibling, 1 reply; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-03-28 14:44 UTC (permalink / raw)
To: Prarit Bhargava
Cc: virtualization, Andrew Morton, Ingo Molnar, John Hawkes,
Linux Kernel, Eric Dumazet
Prarit Bhargava wrote:
> I don't like the idea of having touch_softlockup_watchdog exported
> with your new code -- we still have two methods of effecting the
> softlockup watchdog and that's confusing and its going to cause
> serious problems down the road.
It's legacy. There are a few places where it wasn't obvious to me how
to replace the touch_softlockup_watchdog, so I left them for now. But
ideally I think they should all go away.
> Is there a reason that you're pushing the enable/disable? All the
> cases called out seem to be just fine with calls to either effect that
> CPU's softlockup watchdog or doing all CPU's softlockup watchdogs.
Doing all CPUs is meaningless to me. How does that make sense? It
might work in the sense that messages go away, but doesn't it just hide
the fact that one CPU has gone into a spin?
> I agree with the first patch of this set -- it makes sense. But
> beyond that I'm not convinced the rest of the code is needed ... IMO.
Zach has reported seeing spurious softlockup messages on idle machines
running under a hypervisor. And there was also the discussion about how
to deal with a flash update system in which all CPUs are taken over by
the bios for a long period of time, which was causing softlockup to
trigger. It seemed to me that these could all be dealt with in much the
same way, and that disable/enable semantics for dealing with
long-running timer holdoffs is more natural than trying to work out how
to periodically touch the watchdog timer.
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-28 14:44 ` Jeremy Fitzhardinge
@ 2007-03-28 14:51 ` Prarit Bhargava
2007-03-28 15:22 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 33+ messages in thread
From: Prarit Bhargava @ 2007-03-28 14:51 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: virtualization, Andrew Morton, Ingo Molnar, John Hawkes,
Linux Kernel, Eric Dumazet
Jeremy Fitzhardinge wrote:
> Prarit Bhargava wrote:
>
>> I don't like the idea of having touch_softlockup_watchdog exported
>> with your new code -- we still have two methods of effecting the
>> softlockup watchdog and that's confusing and its going to cause
>> serious problems down the road.
>>
>
> It's legacy. There are a few places where it wasn't obvious to me how
> to replace the touch_softlockup_watchdog, so I left them for now. But
> ideally I think they should all go away.
>
>
>> Is there a reason that you're pushing the enable/disable? All the
>> cases called out seem to be just fine with calls to either effect that
>> CPU's softlockup watchdog or doing all CPU's softlockup watchdogs.
>>
>
> Doing all CPUs is meaningless to me. How does that make sense? It
>
You don't have to do them all -- you could do one with (as in my
previous patch -- which I'm not married to BTW ;) )
touch_cpu_softlockup_watchdog()
and all with
touch_softlockup_watchdog()
> Zach has reported seeing spurious softlockup messages on idle machines
> running under a hypervisor. And there was also the discussion about how
> to deal with a flash update system in which all CPUs are taken over by
> the bios for a long period of time, which was causing softlockup to
> trigger. It seemed to me that these could all be dealt with in much the
> same way, and that disable/enable semantics for dealing with
> long-running timer holdoffs is more natural than trying to work out how
> to periodically touch the watchdog timer.
>
>
But wouldn't a call to touch_[cpu_]softlockup_watchdog at the end of the
flash update fix the problem? And ditto for all other areas of the
kernel where we know we're holding off scheduling?
P.
> J
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-28 14:51 ` Prarit Bhargava
@ 2007-03-28 15:22 ` Jeremy Fitzhardinge
2007-03-28 15:27 ` Prarit Bhargava
0 siblings, 1 reply; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-03-28 15:22 UTC (permalink / raw)
To: Prarit Bhargava
Cc: Ingo Molnar, Linux Kernel, virtualization, Eric Dumazet,
Andrew Morton, Chris Lalancette, John Hawkes
Prarit Bhargava wrote:
> You don't have to do them all -- you could do one with (as in my
> previous patch -- which I'm not married to BTW ;) )
>
> touch_cpu_softlockup_watchdog()
>
> and all with
>
> touch_softlockup_watchdog()
Well, I think changing the meaning of touch_softlockup_watchdog() for
all existing callers is wrong - even if you change most of them to refer
to the cpu-local function. There are definitely specific occasions on
which touching all CPUs is the right thing to do, but not in the general
case.
The only thing I really care about in my patches is ignoring stolen
time. It may be that fixing that is enough to fix the reported problems
with spurious watchdog messages on tickless idle CPUs.
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 3/4] Locally disable the softlockup watchdog rather than touching it
2007-03-28 15:22 ` Jeremy Fitzhardinge
@ 2007-03-28 15:27 ` Prarit Bhargava
0 siblings, 0 replies; 33+ messages in thread
From: Prarit Bhargava @ 2007-03-28 15:27 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: virtualization, Andrew Morton, Ingo Molnar, John Hawkes,
Linux Kernel, Eric Dumazet
Jeremy Fitzhardinge wrote:
> Prarit Bhargava wrote:
>
>> You don't have to do them all -- you could do one with (as in my
>> previous patch -- which I'm not married to BTW ;) )
>>
>> touch_cpu_softlockup_watchdog()
>>
>> and all with
>>
>> touch_softlockup_watchdog()
>>
>
> Well, I think changing the meaning of touch_softlockup_watchdog() for
> all existing callers is wrong - even if you change most of them to refer
> to the cpu-local function.
Hmmm .... it was suggested to me that I should mimic what
touch_nmi_watchdog() does.
> There are definitely specific occasions on
> which touching all CPUs is the right thing to do, but not in the general
> case.
>
Yep. That's why I have both a single cpu touch and the whole shebang :)
> The only thing I really care about in my patches is ignoring stolen
> time. It may be that fixing that is enough to fix the reported problems
> with spurious watchdog messages on tickless idle CPUs.
>
>
> J
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-03-27 21:49 ` [patch 1/4] Ignore stolen time in the softlockup watchdog Jeremy Fitzhardinge
@ 2007-04-24 6:49 ` Andrew Morton
2007-04-24 6:58 ` Jeremy Fitzhardinge
2007-04-24 17:51 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 33+ messages in thread
From: Andrew Morton @ 2007-04-24 6:49 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Bhargava, Thomas, Lindsley, Prarit, john stultz, Zachary,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Paul Mackerras, Rick, Martin Schwidefsky, Ingo Molnar, Gleixner
On Tue, 27 Mar 2007 14:49:20 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> The softlockup watchdog is currently a nuisance in a virtual machine,
> since the whole system could have the CPU stolen from it for a long
> period of time. While it would be unlikely for a guest domain to be
> denied timer interrupts for over 10s, it could happen and any softlockup
> message would be completely spurious.
>
> Earlier I proposed that sched_clock() return time in unstolen
> nanoseconds, which is how Xen and VMI currently implement it. If the
> softlockup watchdog uses sched_clock() to measure time, it would
> automatically ignore stolen time, and therefore only report when the
> guest itself locked up. When running native, sched_clock() returns
> real-time nanoseconds, so the behaviour would be unchanged.
>
> Note that sched_clock() used this way is inherently per-cpu, so this
> patch makes sure that the per-processor watchdog thread initialized
> its own timestamp.
This patch
(ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ignore-stolen-time-in-the-softlockup-watchdog.patch)
causes six failures in the locking self-tests, which I must say is rather
clever of it.
Here's the first one:
[17179569.184000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[17179569.184000] ... MAX_LOCKDEP_SUBCLASSES: 8
[17179569.184000] ... MAX_LOCK_DEPTH: 30
[17179569.184000] ... MAX_LOCKDEP_KEYS: 2048
[17179569.184000] ... CLASSHASH_SIZE: 1024
[17179569.184000] ... MAX_LOCKDEP_ENTRIES: 8192
[17179569.184000] ... MAX_LOCKDEP_CHAINS: 16384
[17179569.184000] ... CHAINHASH_SIZE: 8192
[17179569.184000] memory used by lock dependency info: 992 kB
[17179569.184000] per task-struct memory footprint: 1200 bytes
[17179569.184000] ------------------------
[17179569.184000] | Locking API testsuite:
[17179569.184000] ----------------------------------------------------------------------------
[17179569.184000] | spin |wlock |rlock |mutex | wsem | rsem |
[17179569.184000] --------------------------------------------------------------------------
[17179569.184000] A-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184000] A-B-B-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184000] A-B-B-C-C-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184001] A-B-C-A-B-C deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184002] A-B-B-C-C-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184003] A-B-C-D-B-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184004] A-B-C-D-B-C-D-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184005] double unlock: ok | ok | ok | ok | ok | ok |
[17179569.184006] initialize held: ok | ok | ok | ok | ok | ok |
[17179569.184006] bad unlock order: ok | ok | ok | ok | ok | ok |
[17179569.184006] --------------------------------------------------------------------------
[17179569.184006] recursive read-lock: | ok | | ok |
[17179569.184006] recursive read-lock #2: | ok | | ok |
[17179569.184007] mixed read-write-lock: | ok | | ok |
[17179569.184007] mixed write-read-lock: | ok | | ok |
[17179569.184007] --------------------------------------------------------------------------
[17179569.184007] hard-irqs-on + irq-safe-A/12: ok | ok | ok |
[17179569.184007] soft-irqs-on + irq-safe-A/12: ok | ok | ok |
[17179569.184007] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
[17179569.184007] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
[17179569.184007] sirq-safe-A => hirqs-on/12: ok | ok |irq event stamp: 458
[17179569.184007] hardirqs last enabled at (458): [<c01e4116>] irqsafe2A_rlock_12+0x96/0xa3
[17179569.184007] hardirqs last disabled at (457): [<c01095b9>] sched_clock+0x5e/0xe9
[17179569.184007] softirqs last enabled at (454): [<c01e4101>] irqsafe2A_rlock_12+0x81/0xa3
[17179569.184007] softirqs last disabled at (450): [<c01e408b>] irqsafe2A_rlock_12+0xb/0xa3
[17179569.184007] FAILED| [<c0104cf0>] dump_trace+0x63/0x1ec
[17179569.184007] [<c0104e93>] show_trace_log_lvl+0x1a/0x30
[17179569.184007] [<c01059ec>] show_trace+0x12/0x14
[17179569.184007] [<c0105a45>] dump_stack+0x16/0x18
[17179569.184007] [<c01e1eb5>] dotest+0x6b/0x3d0
[17179569.184007] [<c01eb249>] locking_selftest+0x915/0x1a58
[17179569.184007] [<c048c979>] start_kernel+0x1d0/0x2a2
[17179569.184007] =======================
[17179569.184007]
[17179569.184007] sirq-safe-A => hirqs-on/21:irq event stamp: 462
[17179569.184007] hardirqs last enabled at (462): [<c01e3eb2>] irqsafe2A_spin_21+0x1b/0xa3
[17179569.184007] hardirqs last disabled at (461): [<c01095b9>] sched_clock+0x5e/0xe9
[17179569.184007] softirqs last enabled at (454): [<c01e4101>] irqsafe2A_rlock_12+0x81/0xa3
[17179569.184007] softirqs last disabled at (450): [<c01e408b>] irqsafe2A_rlock_12+0xb/0xa3
[17179569.184007] ok |irq event stamp: 466
It's a challenge to even find the code which corresponds with this failure,
so good luck.
It seems fairly sensitive to .config settings. See
http://userweb.kernel.org/~akpm/config-sony.txt
Unfortunately this causes lockdep to disable itself, thus hiding all the
other bugs which people have contributed, so I'll drop your patch for now.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 6:49 ` Andrew Morton
@ 2007-04-24 6:58 ` Jeremy Fitzhardinge
2007-04-24 7:09 ` Andrew Morton
2007-04-24 17:51 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-24 6:58 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Linux Kernel, virtualization, Prarit Bhargava,
Eric Dumazet, Thomas Gleixner, john stultz, Zachary Amsden,
James Morris, Dan Hecht, Paul Mackerras, Martin Schwidefsky,
Chris Lalancette, Rick Lindsley
Andrew Morton wrote:
> On Tue, 27 Mar 2007 14:49:20 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
>
>> The softlockup watchdog is currently a nuisance in a virtual machine,
>> since the whole system could have the CPU stolen from it for a long
>> period of time. While it would be unlikely for a guest domain to be
>> denied timer interrupts for over 10s, it could happen and any softlockup
>> message would be completely spurious.
>>
>> Earlier I proposed that sched_clock() return time in unstolen
>> nanoseconds, which is how Xen and VMI currently implement it. If the
>> softlockup watchdog uses sched_clock() to measure time, it would
>> automatically ignore stolen time, and therefore only report when the
>> guest itself locked up. When running native, sched_clock() returns
>> real-time nanoseconds, so the behaviour would be unchanged.
>>
>> Note that sched_clock() used this way is inherently per-cpu, so this
>> patch makes sure that the per-processor watchdog thread initialized
>> its own timestamp.
>>
>
> This patch
> (ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ignore-stolen-time-in-the-softlockup-watchdog.patch)
> causes six failures in the locking self-tests, which I must say is rather
> clever of it.
>
Interesting. Which variation of sched_clock do you have in your tree at
the moment?
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 6:58 ` Jeremy Fitzhardinge
@ 2007-04-24 7:09 ` Andrew Morton
0 siblings, 0 replies; 33+ messages in thread
From: Andrew Morton @ 2007-04-24 7:09 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Bhargava, Thomas, Lindsley, Prarit, john stultz, Zachary,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Paul Mackerras, Rick, Martin Schwidefsky, Ingo Molnar, Gleixner
On Mon, 23 Apr 2007 23:58:20 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Andrew Morton wrote:
> > On Tue, 27 Mar 2007 14:49:20 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> >
> >
> >> The softlockup watchdog is currently a nuisance in a virtual machine,
> >> since the whole system could have the CPU stolen from it for a long
> >> period of time. While it would be unlikely for a guest domain to be
> >> denied timer interrupts for over 10s, it could happen and any softlockup
> >> message would be completely spurious.
> >>
> >> Earlier I proposed that sched_clock() return time in unstolen
> >> nanoseconds, which is how Xen and VMI currently implement it. If the
> >> softlockup watchdog uses sched_clock() to measure time, it would
> >> automatically ignore stolen time, and therefore only report when the
> >> guest itself locked up. When running native, sched_clock() returns
> >> real-time nanoseconds, so the behaviour would be unchanged.
> >>
> >> Note that sched_clock() used this way is inherently per-cpu, so this
> >> patch makes sure that the per-processor watchdog thread initialized
> >> its own timestamp.
> >>
> >
> > This patch
> > (ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ignore-stolen-time-in-the-softlockup-watchdog.patch)
> > causes six failures in the locking self-tests, which I must say is rather
> > clever of it.
> >
>
> Interesting.
I'll say.
> Which variation of sched_clock do you have in your tree at
> the moment?
Andi's, plus the below fix.
Sigh. I thought I was only two more bugs away from a release, then...
[18014389.347124] BUG: unable to handle kernel paging request at virtual address 6b6b7193
[18014389.347142] printing eip:
[18014389.347149] c029a80c
[18014389.347156] *pde = 00000000
[18014389.347166] Oops: 0000 [#1]
[18014389.347174] Modules linked in: i915 drm ipw2200 sonypi ipv6 autofs4 hidp l2cap bluetooth sunrpc nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables cpufreq_ondemand video sbs button battery asus_acpi ac nvram ohci1394 ieee1394 ehci_hcd uhci_hcd sg joydev snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm sr_mod cdrom snd_timer ieee80211 i2c_i801 piix ieee80211_crypt i2c_core generic snd soundcore snd_page_alloc ext3 jbd ide_disk ide_core
[18014389.347520] CPU: 0
[18014389.347521] EIP: 0060:[<c029a80c>] Tainted: G D VLI
[18014389.347522] EFLAGS: 00010296 (2.6.21-rc7-mm1 #35)
[18014389.347547] EIP is at input_release_device+0x8/0x4e
[18014389.347555] eax: c99709a8 ebx: 6b6b6b6b ecx: 00000286 edx: 00000000
[18014389.347563] esi: 6b6b6b6b edi: c99709cc ebp: c21e3d40 esp: c21e3d38
[18014389.347571] ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
[18014389.347580] Process khubd (pid: 159, ti=c21e2000 task=c20a62f0 task.ti=c21e2000)
[18014389.347588] Stack: 6b6b6b6b c99709a8 c21e3d60 c029b489 c2014ec8 c9182000 c96b167c c9970954
[18014389.347655] c9970954 c99709cc c21e3d80 c029d401 c9977a6c c96b1000 c21e3d90 c9970954
[18014389.347708] c99709a8 c9164000 c21e3d90 c029d4b5 c96b1000 c9970564 c21e3db0 c029c50b
[18014389.347771] Call Trace:
[18014389.347792] [<c029b489>] input_close_device+0x13/0x51
[18014389.347810] [<c029d401>] mousedev_destroy+0x29/0x7e
[18014389.347827] [<c029d4b5>] mousedev_disconnect+0x5f/0x63
[18014389.347842] [<c029c50b>] input_unregister_device+0x6a/0x100
[18014389.347858] [<c02abf9c>] hidinput_disconnect+0x24/0x41
[18014389.347874] [<c02aef29>] hid_disconnect+0x79/0xc9
[18014389.347889] [<c028e1db>] usb_unbind_interface+0x47/0x8f
[18014389.347916] [<c0256852>] __device_release_driver+0x74/0x90
[18014389.347933] [<c0256c5f>] device_release_driver+0x37/0x4e
[18014389.347957] [<c02561c6>] bus_remove_device+0x73/0x82
[18014389.347977] [<c02547c1>] device_del+0x214/0x28c
[18014389.348132] [<c028bb72>] usb_disable_device+0x62/0xc2
[18014389.348148] [<c0288893>] usb_disconnect+0x99/0x126
[18014389.348163] [<c0288d2c>] hub_thread+0x3a5/0xb07
[18014389.348178] [<c012cbe5>] kthread+0x6e/0x79
[18014389.348194] [<c0104917>] kernel_thread_helper+0x7/0x10
[18014389.348210] =======================
[18014389.348218] INFO: lockdep is turned off.
[18014389.348224] Code: 5b 5d c3 55 b9 f0 ff ff ff 8b 50 0c 89 e5 83 ba 28 06 00 00 00 75 08 89 82 28 06 00 00 31 c9 5d 89 c8 c3 55 89 e5 56 53 8b 70 0c <39> 86 28 06 00 00 75 3a 8b 9e e4 08 00 00 c7 86 28 06 00 00 00
I dunno. I'll keep plugging for another couple hours then I'll shove
out what I have as a -mm snapshot whatsit.
Things are just ridiculous. I'm thinking of having a hard-disk crash and
accidentally losing everything.
From: Andrew Morton <akpm@linux-foundation.org>
WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text:sc_cpu_event from .data between 'sc_cpu_notifier' (at offset 0x2110) and 'mcelog'
Use hotcpu_notifier(). This takes care of making sure that the unused code
disappears from vmlinux if !CONFIG_HOTPLUG_CPU, too.
Please, test allnoconfig builds and watch for the warnings?
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/sched-clock.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff -puN arch/i386/kernel/sched-clock.c~fix-x86_64-mm-sched-clock-share arch/i386/kernel/sched-clock.c
--- a/arch/i386/kernel/sched-clock.c~fix-x86_64-mm-sched-clock-share
+++ a/arch/i386/kernel/sched-clock.c
@@ -169,20 +169,15 @@ sc_cpu_event(struct notifier_block *self
return NOTIFY_DONE;
}
-static struct notifier_block sc_cpu_notifier = {
- .notifier_call = sc_cpu_event
-};
-
static __init int init_sched_clock(void)
{
struct cpufreq_freqs f = { .cpu = get_cpu(), .new = 0 };
WARN_ON(num_online_cpus() > 1);
call_r_s_f(&f);
put_cpu();
- register_cpu_notifier(&sc_cpu_notifier);
+ hotcpu_notifier(sc_cpu_event, 0);
cpufreq_register_notifier(&sc_freq_notifier,
CPUFREQ_TRANSITION_NOTIFIER);
return 0;
}
core_initcall(init_sched_clock);
-
_
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 6:49 ` Andrew Morton
2007-04-24 6:58 ` Jeremy Fitzhardinge
@ 2007-04-24 17:51 ` Jeremy Fitzhardinge
2007-04-24 17:57 ` Andrew Morton
1 sibling, 1 reply; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-24 17:51 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Linux Kernel, virtualization, Prarit Bhargava,
Eric Dumazet, Thomas Gleixner, john stultz, Zachary Amsden,
James Morris, Dan Hecht, Paul Mackerras, Martin Schwidefsky,
Chris Lalancette, Rick Lindsley, Andi Kleen
Andrew Morton wrote:
> It seems fairly sensitive to .config settings. See
> http://userweb.kernel.org/~akpm/config-sony.txt
>
I haven't tried your config yet, but I haven't managed to reproduce it
by playing with the usual suspects in my config (SMP, PREEMPT). Any
idea about which config changes make the difference?
Hm, is it caused by using sched_clock() to generate the printk
timestamps while generating the lock test output?
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 17:51 ` Jeremy Fitzhardinge
@ 2007-04-24 17:57 ` Andrew Morton
2007-04-24 18:16 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 33+ messages in thread
From: Andrew Morton @ 2007-04-24 17:57 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Bhargava, Thomas, Lindsley, Prarit, john stultz, Zachary,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Paul Mackerras, Rick, Martin Schwidefsky, Ingo Molnar, Gleixner
On Tue, 24 Apr 2007 10:51:35 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Andrew Morton wrote:
> > It seems fairly sensitive to .config settings. See
> > http://userweb.kernel.org/~akpm/config-sony.txt
> >
>
> I haven't tried your config yet, but I haven't managed to reproduce it
> by playing with the usual suspects in my config (SMP, PREEMPT). Any
> idea about which config changes make the difference?
I said that because the damn thing went away when I was hunting it down
because I lost the config and was unable to remember the right combination
of debug settings. Fortunately it later came back so I took care to
preserve the config.
> Hm, is it caused by using sched_clock() to generate the printk
> timestamps while generating the lock test output?
Conceivably. What does that locking API test do?
I was using printk timestamps and netconsole at the time.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 17:57 ` Andrew Morton
@ 2007-04-24 18:16 ` Jeremy Fitzhardinge
2007-04-24 18:32 ` Andrew Morton
0 siblings, 1 reply; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-24 18:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Linux Kernel, virtualization, Prarit Bhargava,
Eric Dumazet, Thomas Gleixner, john stultz, Zachary Amsden,
James Morris, Dan Hecht, Paul Mackerras, Martin Schwidefsky,
Chris Lalancette, Rick Lindsley, Andi Kleen
Andrew Morton wrote:
> I said that because the damn thing went away when I was hunting it down
> because I lost the config and was unable to remember the right combination
> of debug settings. Fortunately it later came back so I took care to
> preserve the config.
>
sched_clock doesn't *do* anything except flap interrupts. Oh, wait, have
you got Andi's bugfixed version of the sched_clock patch? The first
version did a local_save_flags rather than a local_irq_save.
>> Hm, is it caused by using sched_clock() to generate the printk
>> timestamps while generating the lock test output?
>>
>
> Conceivably. What does that locking API test do?
>
Didn't make a difference here. Building your config now.
> I was using printk timestamps and netconsole at the time.
>
Ah, great, now you're going to make me setup netconsole...
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 18:16 ` Jeremy Fitzhardinge
@ 2007-04-24 18:32 ` Andrew Morton
2007-04-24 20:00 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 33+ messages in thread
From: Andrew Morton @ 2007-04-24 18:32 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Bhargava, Thomas, Lindsley, Prarit, john stultz, Zachary,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Paul Mackerras, Rick, Martin Schwidefsky, Ingo Molnar, Gleixner
On Tue, 24 Apr 2007 11:16:09 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Andrew Morton wrote:
> > I said that because the damn thing went away when I was hunting it down
> > because I lost the config and was unable to remember the right combination
> > of debug settings. Fortunately it later came back so I took care to
> > preserve the config.
> >
>
> sched_clock doesn't *do* anything except flap interrupts.
Well, it _is_ mysterious.
Did you try to locate the code which failed? I got lost in macros and
include files, and gave up very very easily. Stop hiding, Ingo.
> Oh, wait, have
> you got Andi's bugfixed version of the sched_clock patch? The first
> version did a local_save_flags rather than a local_irq_save.
I have whatever I pulled from firstfloor over the weekend. It's in
rc7-mm1. No, it doesn't use local_save_flags.
> >> Hm, is it caused by using sched_clock() to generate the printk
> >> timestamps while generating the lock test output?
> >>
> >
> > Conceivably. What does that locking API test do?
> >
>
> Didn't make a difference here. Building your config now.
>
> > I was using printk timestamps and netconsole at the time.
> >
>
> Ah, great, now you're going to make me setup netconsole...
>
That's a doddle.
On test system, boot with
netconsole=4444@<test-system-ip-addr>/eth0,<udp-port-no>@<workstation-ip-addr>/<workstation-mac-addr>
On workstation:
sudo netcat -u -l -p <udp-port-no> | tee -a ~/.log/log-<test-system-hostname>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 18:32 ` Andrew Morton
@ 2007-04-24 20:00 ` Jeremy Fitzhardinge
2007-04-24 20:14 ` Andrew Morton
2007-04-24 20:24 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-24 20:00 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Linux Kernel, virtualization, Prarit Bhargava,
Eric Dumazet, Thomas Gleixner, john stultz, Zachary Amsden,
James Morris, Dan Hecht, Paul Mackerras, Martin Schwidefsky,
Chris Lalancette, Rick Lindsley, Andi Kleen
Andrew Morton wrote:
> Well, it _is_ mysterious.
>
> Did you try to locate the code which failed? I got lost in macros and
> include files, and gave up very very easily. Stop hiding, Ingo.
>
OK, I've managed to reproduce it. Removing the local_irq_save/restore
from sched_clock() makes it go away, as I'd expect (otherwise it would
really be magic). But given that it never seems to touch the softlockup
during testing, I have no idea what difference it makes...
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:00 ` Jeremy Fitzhardinge
@ 2007-04-24 20:14 ` Andrew Morton
2007-04-24 20:46 ` Jeremy Fitzhardinge
2007-04-24 20:24 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 33+ messages in thread
From: Andrew Morton @ 2007-04-24 20:14 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Bhargava, Thomas, Lindsley, Prarit, john stultz, Zachary,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Paul Mackerras, Rick, Martin Schwidefsky, Ingo Molnar, Gleixner
On Tue, 24 Apr 2007 13:00:49 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Andrew Morton wrote:
> > Well, it _is_ mysterious.
> >
> > Did you try to locate the code which failed? I got lost in macros and
> > include files, and gave up very very easily. Stop hiding, Ingo.
> >
>
> OK, I've managed to reproduce it. Removing the local_irq_save/restore
> from sched_clock() makes it go away, as I'd expect (otherwise it would
> really be magic).
erm, why do you expect that? A local_irq_save()/local_irq_restore() pair
shouldn't be affecting anything?
> But given that it never seems to touch the softlockup
> during testing, I have no idea what difference it makes...
To what softlockup are you referring, and what does that have to do with
anything?
<feels dumb>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:00 ` Jeremy Fitzhardinge
2007-04-24 20:14 ` Andrew Morton
@ 2007-04-24 20:24 ` Jeremy Fitzhardinge
2007-04-24 20:33 ` Andrew Morton
2007-04-24 20:52 ` Daniel Walker
1 sibling, 2 replies; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-24 20:24 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Prarit Bhargava, Rick Lindsley, Thomas Gleixner, john stultz,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Paul Mackerras, Martin Schwidefsky, Andrew Morton, Ingo Molnar
Jeremy Fitzhardinge wrote:
> Andrew Morton wrote:
>
>> Well, it _is_ mysterious.
>>
>> Did you try to locate the code which failed? I got lost in macros and
>> include files, and gave up very very easily. Stop hiding, Ingo.
>>
>>
>
> OK, I've managed to reproduce it. Removing the local_irq_save/restore
> from sched_clock() makes it go away, as I'd expect (otherwise it would
> really be magic). But given that it never seems to touch the softlockup
> during testing, I have no idea what difference it makes...
And sched_clock's use of local_irq_save/restore appears to be absolutely
correct, so I think it must be triggering a bug in either the self-tests
or lockdep itself.
The only way I could actually extract the test code itself was to run
the whole thing through cpp+indent, but it doesn't shed much light.
It's also not clear to me if there are 6 independent failures, or if
they're a cascade.
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:24 ` Jeremy Fitzhardinge
@ 2007-04-24 20:33 ` Andrew Morton
2007-04-24 20:48 ` Jeremy Fitzhardinge
2007-04-24 20:52 ` Daniel Walker
1 sibling, 1 reply; 33+ messages in thread
From: Andrew Morton @ 2007-04-24 20:33 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Prarit Bhargava, Rick Lindsley, Thomas Gleixner, john stultz,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Mackerras, Martin Schwidefsky, Paul, Ingo Molnar
On Tue, 24 Apr 2007 13:24:24 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Jeremy Fitzhardinge wrote:
> > Andrew Morton wrote:
> >
> >> Well, it _is_ mysterious.
> >>
> >> Did you try to locate the code which failed? I got lost in macros and
> >> include files, and gave up very very easily. Stop hiding, Ingo.
> >>
> >>
> >
> > OK, I've managed to reproduce it. Removing the local_irq_save/restore
> > from sched_clock() makes it go away, as I'd expect (otherwise it would
> > really be magic). But given that it never seems to touch the softlockup
> > during testing, I have no idea what difference it makes...
>
> And sched_clock's use of local_irq_save/restore appears to be absolutely
> correct, so I think it must be triggering a bug in either the self-tests
> or lockdep itself.
It's weird. And I don't think the locking selftest code calls
sched_clock() (or any other time-related thing) at all, does it?
> The only way I could actually extract the test code itself was to run
> the whole thing through cpp+indent, but it doesn't shed much light.
>
> It's also not clear to me if there are 6 independent failures, or if
> they're a cascade.
Oh well. I'll restore the patches and when people hit problems we can
blame Ingo!
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:14 ` Andrew Morton
@ 2007-04-24 20:46 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-24 20:46 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Linux Kernel, virtualization, Prarit Bhargava,
Eric Dumazet, Thomas Gleixner, john stultz, Zachary Amsden,
James Morris, Dan Hecht, Paul Mackerras, Martin Schwidefsky,
Chris Lalancette, Rick Lindsley, Andi Kleen
Andrew Morton wrote:
> On Tue, 24 Apr 2007 13:00:49 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
>
>> Andrew Morton wrote:
>>
>>> Well, it _is_ mysterious.
>>>
>>> Did you try to locate the code which failed? I got lost in macros and
>>> include files, and gave up very very easily. Stop hiding, Ingo.
>>>
>>>
>> OK, I've managed to reproduce it. Removing the local_irq_save/restore
>> from sched_clock() makes it go away, as I'd expect (otherwise it would
>> really be magic).
>>
>
> erm, why do you expect that? A local_irq_save()/local_irq_restore() pair
> shouldn't be affecting anything?
>
Well, yes. I have no idea why it causes a problem. But other than
that, sched_clock does absolutely nothing which would affect lockdep state.
>> But given that it never seems to touch the softlockup
>> during testing, I have no idea what difference it makes...
>>
>
> To what softlockup are you referring, and what does that have to do with
> anything?
You dropped this patch, "Ignore stolen time in the softlockup watchdog"
because its presence triggers the lock tester errors. The only thing
this patch does is use sched_clock() rather than jiffies to measure
lockup time. It therefore appears, for some reason, that using
sched_clock() in the softlockup code is making the lock-test fail.
Since the lock test doesn't explicitly do any softlockup stuff, the
connection must be implicit via sched_lock - but how, I can't imagine.
Since sched_clock() itself looks perfectly OK, and the softlockup
watchdog seems fine too, I can only conclude its a bug in the lock
testing stuff. But I don't know what.
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:33 ` Andrew Morton
@ 2007-04-24 20:48 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 33+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-24 20:48 UTC (permalink / raw)
To: Andrew Morton
Cc: Prarit Bhargava, Rick Lindsley, john stultz, Linux Kernel,
Eric Dumazet, virtualization, Chris Lalancette, Paul Mackerras,
Martin Schwidefsky, Ingo Molnar, Thomas Gleixner
Andrew Morton wrote:
> It's weird. And I don't think the locking selftest code calls
> sched_clock() (or any other time-related thing) at all, does it?
>
I guess it ends up going through the scheduler, which does use it.
But... <shrug>
J
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:24 ` Jeremy Fitzhardinge
2007-04-24 20:33 ` Andrew Morton
@ 2007-04-24 20:52 ` Daniel Walker
2007-04-24 20:59 ` Ingo Molnar
2007-04-24 21:20 ` Andi Kleen
1 sibling, 2 replies; 33+ messages in thread
From: Daniel Walker @ 2007-04-24 20:52 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Andrew Morton, Prarit Bhargava, Rick Lindsley, john stultz,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Paul Mackerras, Martin Schwidefsky, Ingo Molnar, Thomas Gleixner
On Tue, 2007-04-24 at 13:24 -0700, Jeremy Fitzhardinge wrote:
> And sched_clock's use of local_irq_save/restore appears to be absolutely
> correct, so I think it must be triggering a bug in either the self-tests
> or lockdep itself.
Why does sched_clock need to disable interrupts?
Daniel
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:52 ` Daniel Walker
@ 2007-04-24 20:59 ` Ingo Molnar
2007-04-24 21:01 ` Daniel Walker
2007-04-24 21:14 ` Andrew Morton
2007-04-24 21:20 ` Andi Kleen
1 sibling, 2 replies; 33+ messages in thread
From: Ingo Molnar @ 2007-04-24 20:59 UTC (permalink / raw)
To: Daniel Walker
Cc: Prarit Bhargava, Rick Lindsley, john stultz, Linux Kernel,
Eric Dumazet, virtualization, Chris Lalancette, Paul Mackerras,
Martin Schwidefsky, Andrew Morton, Thomas Gleixner
* Daniel Walker <dwalker@mvista.com> wrote:
> On Tue, 2007-04-24 at 13:24 -0700, Jeremy Fitzhardinge wrote:
>
> > And sched_clock's use of local_irq_save/restore appears to be absolutely
> > correct, so I think it must be triggering a bug in either the self-tests
> > or lockdep itself.
>
> Why does sched_clock need to disable interrupts?
i concur. To me it appears not "absolutely correct" that someone
apparently added local_irq_save/restore to sched_clock(), but "absolute
madness". sched_clock() is _very_ performance-sensitive for the
scheduler, do not mess with it.
Ingo
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:59 ` Ingo Molnar
@ 2007-04-24 21:01 ` Daniel Walker
2007-04-24 21:14 ` Andrew Morton
1 sibling, 0 replies; 33+ messages in thread
From: Daniel Walker @ 2007-04-24 21:01 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jeremy Fitzhardinge, Andrew Morton, Prarit Bhargava,
Rick Lindsley, john stultz, Linux Kernel, Eric Dumazet,
virtualization, Chris Lalancette, Paul Mackerras,
Martin Schwidefsky, Thomas Gleixner
On Tue, 2007-04-24 at 22:59 +0200, Ingo Molnar wrote:
> * Daniel Walker <dwalker@mvista.com> wrote:
>
> > On Tue, 2007-04-24 at 13:24 -0700, Jeremy Fitzhardinge wrote:
> >
> > > And sched_clock's use of local_irq_save/restore appears to be absolutely
> > > correct, so I think it must be triggering a bug in either the self-tests
> > > or lockdep itself.
> >
> > Why does sched_clock need to disable interrupts?
>
> i concur. To me it appears not "absolutely correct" that someone
> apparently added local_irq_save/restore to sched_clock(), but "absolute
> madness". sched_clock() is _very_ performance-sensitive for the
> scheduler, do not mess with it.
It looks like it's used in some sort of warp check, but only when
jiffies is used .. So I'm totally stumped why it's in there..
Daniel
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:59 ` Ingo Molnar
2007-04-24 21:01 ` Daniel Walker
@ 2007-04-24 21:14 ` Andrew Morton
1 sibling, 0 replies; 33+ messages in thread
From: Andrew Morton @ 2007-04-24 21:14 UTC (permalink / raw)
To: Ingo Molnar
Cc: Prarit Bhargava, Daniel Walker, Rick Lindsley, john stultz,
Linux Kernel, Eric Dumazet, virtualization, Chris Lalancette,
Mackerras, Martin Schwidefsky, Paul, Thomas Gleixner
On Tue, 24 Apr 2007 22:59:18 +0200 Ingo Molnar <mingo@elte.hu> wrote:
>
> * Daniel Walker <dwalker@mvista.com> wrote:
>
> > On Tue, 2007-04-24 at 13:24 -0700, Jeremy Fitzhardinge wrote:
> >
> > > And sched_clock's use of local_irq_save/restore appears to be absolutely
> > > correct, so I think it must be triggering a bug in either the self-tests
> > > or lockdep itself.
> >
> > Why does sched_clock need to disable interrupts?
>
> i concur. To me it appears not "absolutely correct" that someone
> apparently added local_irq_save/restore to sched_clock(), but "absolute
> madness". sched_clock() is _very_ performance-sensitive for the
> scheduler, do not mess with it.
Why does a local_irq_save/restore make the selftests fail??
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 20:52 ` Daniel Walker
2007-04-24 20:59 ` Ingo Molnar
@ 2007-04-24 21:20 ` Andi Kleen
2007-04-24 21:33 ` Daniel Walker
1 sibling, 1 reply; 33+ messages in thread
From: Andi Kleen @ 2007-04-24 21:20 UTC (permalink / raw)
To: virtualization
Cc: Daniel Walker, Jeremy Fitzhardinge, Prarit Bhargava,
Rick Lindsley, Thomas Gleixner, john stultz, Linux Kernel,
Eric Dumazet, virtualization, Chris Lalancette, Paul Mackerras,
Martin Schwidefsky, Andrew Morton, Ingo Molnar
On Tuesday 24 April 2007 22:52:27 Daniel Walker wrote:
> On Tue, 2007-04-24 at 13:24 -0700, Jeremy Fitzhardinge wrote:
>
> > And sched_clock's use of local_irq_save/restore appears to be absolutely
> > correct, so I think it must be triggering a bug in either the self-tests
> > or lockdep itself.
>
> Why does sched_clock need to disable interrupts?
It's only used in the instable path which is kind of "i already threw up
my hands" anyways
I use it because when you transition from stable (TSC) to instable (jiffies)
the only way to avoid the clock jumping backwards is to remember and update the
last value. To avoid races with parallel cpufreq handlers or timer
interrupts this small section needs to run with interrupts disabled.
The alternative would be a seqlock, but people have complained about this
earlier too so i judged irq disabling better.
-Andi
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
2007-04-24 21:20 ` Andi Kleen
@ 2007-04-24 21:33 ` Daniel Walker
0 siblings, 0 replies; 33+ messages in thread
From: Daniel Walker @ 2007-04-24 21:33 UTC (permalink / raw)
To: Andi Kleen
Cc: virtualization, Jeremy Fitzhardinge, Prarit Bhargava,
Rick Lindsley, Thomas Gleixner, john stultz, Linux Kernel,
Eric Dumazet, virtualization, Chris Lalancette, Paul Mackerras,
Martin Schwidefsky, Andrew Morton, Ingo Molnar
On Tue, 2007-04-24 at 23:20 +0200, Andi Kleen wrote:
> On Tuesday 24 April 2007 22:52:27 Daniel Walker wrote:
> > On Tue, 2007-04-24 at 13:24 -0700, Jeremy Fitzhardinge wrote:
> >
> > > And sched_clock's use of local_irq_save/restore appears to be absolutely
> > > correct, so I think it must be triggering a bug in either the self-tests
> > > or lockdep itself.
> >
> > Why does sched_clock need to disable interrupts?
>
> It's only used in the instable path which is kind of "i already threw up
> my hands" anyways
>
> I use it because when you transition from stable (TSC) to instable (jiffies)
> the only way to avoid the clock jumping backwards is to remember and update the
> last value. To avoid races with parallel cpufreq handlers or timer
> interrupts this small section needs to run with interrupts disabled.
Preemption is already disabled with the get_cpu_var() , so it seems like
the timer interrupt is the only worry? I find it confusing that the
access of jiffies_64 isn't protected from interrupts, it's only the
per_cpu data which should already be protected by the
get_cpu_var()/put_cpu_var ..
Daniel
^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2007-04-24 21:33 UTC | newest]
Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-03-27 21:49 [patch 0/4] Revised softlockup watchdog improvement patches Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 1/4] Ignore stolen time in the softlockup watchdog Jeremy Fitzhardinge
2007-04-24 6:49 ` Andrew Morton
2007-04-24 6:58 ` Jeremy Fitzhardinge
2007-04-24 7:09 ` Andrew Morton
2007-04-24 17:51 ` Jeremy Fitzhardinge
2007-04-24 17:57 ` Andrew Morton
2007-04-24 18:16 ` Jeremy Fitzhardinge
2007-04-24 18:32 ` Andrew Morton
2007-04-24 20:00 ` Jeremy Fitzhardinge
2007-04-24 20:14 ` Andrew Morton
2007-04-24 20:46 ` Jeremy Fitzhardinge
2007-04-24 20:24 ` Jeremy Fitzhardinge
2007-04-24 20:33 ` Andrew Morton
2007-04-24 20:48 ` Jeremy Fitzhardinge
2007-04-24 20:52 ` Daniel Walker
2007-04-24 20:59 ` Ingo Molnar
2007-04-24 21:01 ` Daniel Walker
2007-04-24 21:14 ` Andrew Morton
2007-04-24 21:20 ` Andi Kleen
2007-04-24 21:33 ` Daniel Walker
2007-03-27 21:49 ` [patch 2/4] percpu enable flag for " Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 3/4] Locally disable the softlockup watchdog rather than touching it Jeremy Fitzhardinge
2007-03-28 13:33 ` Prarit Bhargava
2007-03-28 13:50 ` Andi Kleen
2007-03-28 14:00 ` Prarit Bhargava
2007-03-28 14:09 ` Andi Kleen
2007-03-28 14:13 ` Prarit Bhargava
2007-03-28 14:44 ` Jeremy Fitzhardinge
2007-03-28 14:51 ` Prarit Bhargava
2007-03-28 15:22 ` Jeremy Fitzhardinge
2007-03-28 15:27 ` Prarit Bhargava
2007-03-27 21:49 ` [patch 4/4] Add global disable/enable for softlockup watchdog Jeremy Fitzhardinge
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).