public inbox for linux-kernel@vger.kernel.org
* [PATCH] panic: Fix panic_timeout accuracy when running on a hypervisor
@ 2010-02-01  4:14 Anton Blanchard
  2010-02-01 22:02 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Anton Blanchard @ 2010-02-01  4:14 UTC
  To: Ingo Molnar, Andrew Morton; +Cc: linux-kernel


I've had some complaints about panic_timeout being wildly inaccurate on
shared processor PowerPC partitions (a 3 minute panic_timeout taking 30
minutes).

The problem is that we loop on mdelay(1), and with a 1ms in 10ms hypervisor
timeslice each of these calls takes 10ms (i.e. 10x longer). I expect other
platforms with shared processor hypervisors will see the same issue.
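
The arithmetic behind the 3 minute versus 30 minute figure can be sketched
as below. This is a simplified model, not kernel code: the helper name
inflated_wait_ms() is hypothetical, and it assumes every mdelay() call is
rounded up to a whole number of hypervisor dispatch periods.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the kernel): estimate the wall-clock
 * time, in ms, of a loop that requests total_ms of delay in chunk_ms
 * pieces, assuming each piece is rounded up to a multiple of period_ms.
 */
static long inflated_wait_ms(long total_ms, long chunk_ms, long period_ms)
{
	long calls, per_call;

	assert(total_ms >= 0 && chunk_ms > 0 && period_ms > 0);

	/* Number of delay calls needed to cover the requested total. */
	calls = (total_ms + chunk_ms - 1) / chunk_ms;

	/* Each call costs at least one full period under this model. */
	per_call = ((chunk_ms + period_ms - 1) / period_ms) * period_ms;

	return calls * per_call;
}
```

Under this model, a 180000ms timeout split into 1ms delays with a 10ms
period comes out at 1800000ms (30 minutes), while the same timeout split
into 1000ms delays stays at 180000ms (3 minutes).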

This patch keeps the old behaviour if we have a panic_blink (only keyboard
LEDs right now) and does 1 second mdelays if we don't.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: linux-cpumask/kernel/panic.c
===================================================================
--- linux-cpumask.orig/kernel/panic.c	2010-02-01 14:17:40.140961595 +1100
+++ linux-cpumask/kernel/panic.c	2010-02-01 14:30:45.549711407 +1100
@@ -36,15 +36,36 @@ ATOMIC_NOTIFIER_HEAD(panic_notifier_list
 
 EXPORT_SYMBOL(panic_notifier_list);
 
-static long no_blink(long time)
-{
-	return 0;
-}
-
 /* Returns how long it waited in ms */
 long (*panic_blink)(long time);
 EXPORT_SYMBOL(panic_blink);
 
+static void panic_blink_one_second(void)
+{
+	static long i = 0, end;
+
+	if (panic_blink) {
+		end = i + MSEC_PER_SEC;
+
+		while (i < end) {
+			i += panic_blink(i);
+			mdelay(1);
+			i++;
+		}
+	} else {
+		/*
+		 * When running under a hypervisor a small mdelay may get
+		 * rounded up to the hypervisor timeslice. For example, with
+		 * a 1ms in 10ms hypervisor timeslice we might inflate a
+		 * mdelay(1) loop by 10x.
+		 *
+		 * If we have nothing to blink, spin on 1 second calls to
+		 * mdelay to avoid this.
+		 */
+		mdelay(MSEC_PER_SEC);
+	}
+}
+
 /**
  *	panic - halt the system
  *	@fmt: The text string to print
@@ -95,9 +116,6 @@ NORET_TYPE void panic(const char * fmt, 
 
 	bust_spinlocks(0);
 
-	if (!panic_blink)
-		panic_blink = no_blink;
-
 	if (panic_timeout > 0) {
 		/*
 		 * Delay timeout seconds before rebooting the machine.
@@ -105,11 +123,9 @@ NORET_TYPE void panic(const char * fmt, 
 		 */
 		printk(KERN_EMERG "Rebooting in %d seconds..", panic_timeout);
 
-		for (i = 0; i < panic_timeout*1000; ) {
+		for (i = 0; i < panic_timeout; i++) {
 			touch_nmi_watchdog();
-			i += panic_blink(i);
-			mdelay(1);
-			i++;
+			panic_blink_one_second();
 		}
 		/*
 		 * This will not be a clean reboot, with everything
@@ -135,11 +151,9 @@ NORET_TYPE void panic(const char * fmt, 
 	}
 #endif
 	local_irq_enable();
-	for (i = 0; ; ) {
+	while (1) {
 		touch_softlockup_watchdog();
-		i += panic_blink(i);
-		mdelay(1);
-		i++;
+		panic_blink_one_second();
 	}
 }
 


* Re: [PATCH] panic: Fix panic_timeout accuracy when running on a hypervisor
  2010-02-01  4:14 [PATCH] panic: Fix panic_timeout accuracy when running on a hypervisor Anton Blanchard
@ 2010-02-01 22:02 ` Andrew Morton
  2010-02-02  0:50   ` Anton Blanchard
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2010-02-01 22:02 UTC
  To: Anton Blanchard; +Cc: Ingo Molnar, linux-kernel

On Mon, 1 Feb 2010 15:14:30 +1100
Anton Blanchard <anton@samba.org> wrote:

> 
> I've had some complaints about panic_timeout being wildly inaccurate on
> shared processor PowerPC partitions (a 3 minute panic_timeout taking 30
> minutes).
> 
> The problem is that we loop on mdelay(1), and with a 1ms in 10ms hypervisor
> timeslice each of these calls takes 10ms (i.e. 10x longer). I expect other
> platforms with shared processor hypervisors will see the same issue.
> 
> This patch keeps the old behaviour if we have a panic_blink (only keyboard
> LEDs right now) and does 1 second mdelays if we don't.
> 
> ...
>
> +static void panic_blink_one_second(void)
> +{
> +	static long i = 0, end;

I assumed the `static' was a brainfart and removed it?

> +	if (panic_blink) {
> +		end = i + MSEC_PER_SEC;
> +
> +		while (i < end) {
> +			i += panic_blink(i);
> +			mdelay(1);
> +			i++;
> +		}
> +	} else {
> +		/*
> +		 * When running under a hypervisor a small mdelay may get
> +		 * rounded up to the hypervisor timeslice. For example, with
> +		 * a 1ms in 10ms hypervisor timeslice we might inflate a
> +		 * mdelay(1) loop by 10x.
> +		 *
> +		 * If we have nothing to blink, spin on 1 second calls to
> +		 * mdelay to avoid this.
> +		 */
> +		mdelay(MSEC_PER_SEC);
> +	}
> +}

In fact we can simplify it a bit:

--- a/kernel/panic.c~panic-fix-panic_timeout-accuracy-when-running-on-a-hypervisor-fix
+++ a/kernel/panic.c
@@ -42,12 +42,10 @@ EXPORT_SYMBOL(panic_blink);
 
 static void panic_blink_one_second(void)
 {
-	static long i = 0, end;
-
 	if (panic_blink) {
-		end = i + MSEC_PER_SEC;
+		long i = 0;
 
-		while (i < end) {
+		while (i < MSEC_PER_SEC) {
 			i += panic_blink(i);
 			mdelay(1);
 			i++;
_

Why does it do the mdelay() as well as calling panic_blink()?  hm,
because the old code did.  I guess it doesn't trust panic_blink().




* Re: [PATCH] panic: Fix panic_timeout accuracy when running on a hypervisor
  2010-02-01 22:02 ` Andrew Morton
@ 2010-02-02  0:50   ` Anton Blanchard
  0 siblings, 0 replies; 3+ messages in thread
From: Anton Blanchard @ 2010-02-02  0:50 UTC
  To: Andrew Morton; +Cc: Ingo Molnar, linux-kernel


Hi Andrew,

> > +static void panic_blink_one_second(void)
> > +{
> > +	static long i = 0, end;
> 
> I assumed the `static' was a brainfart and removed it?

...

> In fact we can simplify it a bit:
> 
> --- a/kernel/panic.c~panic-fix-panic_timeout-accuracy-when-running-on-a-hypervisor-fix
> +++ a/kernel/panic.c
> @@ -42,12 +42,10 @@ EXPORT_SYMBOL(panic_blink);
>  
>  static void panic_blink_one_second(void)
>  {
> -	static long i = 0, end;
> -
>  	if (panic_blink) {
> -		end = i + MSEC_PER_SEC;
> +		long i = 0;
>  
> -		while (i < end) {
> +		while (i < MSEC_PER_SEC) {
>  			i += panic_blink(i);
>  			mdelay(1);
>  			i++;

Unfortunately the panic_blink users seem to rely on count always increasing:

static long i8042_panic_blink(long count)
{
...
        static long last_blink;
...
        if (count - last_blink < i8042_blink_frequency)
                return 0;

If we reset to 0 each second, this is going to always be true. Ugly interface.

Anton
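
The effect described above can be reproduced with a small user-space
simulation. This is only a sketch: fake_panic_blink() and simulate() are
hypothetical stand-ins, the 500ms interval is an assumed value, and the
parts of i8042_panic_blink() elided with "..." above are assumed to record
last_blink = count whenever the LEDs are toggled.

```c
#include <stdbool.h>

#define BLINK_INTERVAL_MS 500	/* assumed interval, for illustration only */

static long last_blink;		/* plays the role of the callback's static */
static int toggles;		/* number of simulated LED toggles */

/* Stand-in for a panic_blink callback; returns the ms it waited (0 here). */
static long fake_panic_blink(long count)
{
	if (count - last_blink < BLINK_INTERVAL_MS)
		return 0;

	toggles++;		/* the real callback would flip the LEDs */
	last_blink = count;	/* assumption: done in the elided code */
	return 0;
}

/*
 * Run `seconds` one-second windows of the blink loop and return how many
 * toggles happened. mdelay(1) is omitted since only the counter matters.
 */
static int simulate(int seconds, bool reset_each_second)
{
	long i = 0;

	last_blink = 0;
	toggles = 0;

	for (int s = 0; s < seconds; s++) {
		long end;

		if (reset_each_second)
			i = 0;	/* the simplified version: local i = 0 */
		end = i + 1000;

		while (i < end) {
			i += fake_panic_blink(i);
			i++;
		}
	}
	return toggles;
}
```

With a counter that keeps increasing, simulate(3, false) toggles every
500ms. With the counter reset each second, simulate(3, true) toggles once
in the first window and never again, because count - last_blink stays
below the interval from then on.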
