All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch] i386: halt the CPU on serious errors
@ 2006-06-20  4:55 Chuck Ebbert
  2006-06-20  5:09 ` Dave Jones
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Chuck Ebbert @ 2006-06-20  4:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andreas Mohr, Andrew Morton

Halt the CPU when serious errors are encountered and we
deliberately go into an infinite loop.

Suggested by Andreas Mohr.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>

--- 2.6.17-32.orig/arch/i386/kernel/crash.c
+++ 2.6.17-32/arch/i386/kernel/crash.c
@@ -113,8 +113,8 @@ static int crash_nmi_callback(struct pt_
 	disable_local_APIC();
 	atomic_dec(&waiting_for_crash_ipi);
 	/* Assume hlt works */
-	halt();
-	for(;;);
+	for (;;)
+		halt();
 
 	return 1;
 }
--- 2.6.17-32.orig/arch/i386/kernel/doublefault.c
+++ 2.6.17-32/arch/i386/kernel/doublefault.c
@@ -44,7 +44,8 @@ static void doublefault_fn(void)
 		}
 	}
 
-	for (;;) /* nothing */;
+	for (;;)
+		halt();
 }
 
 struct tss_struct doublefault_tss __cacheline_aligned = {
-- 
Chuck
 "You can't read a newspaper if you can't read."  --George W. Bush

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] i386: halt the CPU on serious errors
  2006-06-20  4:55 [patch] i386: halt the CPU on serious errors Chuck Ebbert
@ 2006-06-20  5:09 ` Dave Jones
  2006-06-20  7:50 ` Andreas Mohr
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Dave Jones @ 2006-06-20  5:09 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel, Andreas Mohr, Andrew Morton

On Tue, Jun 20, 2006 at 12:55:25AM -0400, Chuck Ebbert wrote:
 
 > --- 2.6.17-32.orig/arch/i386/kernel/crash.c
 > +++ 2.6.17-32/arch/i386/kernel/crash.c
 > @@ -113,8 +113,8 @@ static int crash_nmi_callback(struct pt_
 >  	disable_local_APIC();
 >  	atomic_dec(&waiting_for_crash_ipi);
 >  	/* Assume hlt works */
 > -	halt();
 > -	for(;;);
 > +	for (;;)
 > +		halt();
 >  
 >  	return 1;

But we should never get past that first halt(), as interrupts are disabled.

 > --- 2.6.17-32.orig/arch/i386/kernel/doublefault.c
 > +++ 2.6.17-32/arch/i386/kernel/doublefault.c
 > @@ -44,7 +44,8 @@ static void doublefault_fn(void)
 >  		}
 >  	}
 >  
 > -	for (;;) /* nothing */;
 > +	for (;;)
 > +		halt();
 >  }

This one would probably be better off as a cpu_relax()

		Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] i386: halt the CPU on serious errors
  2006-06-20  4:55 [patch] i386: halt the CPU on serious errors Chuck Ebbert
  2006-06-20  5:09 ` Dave Jones
@ 2006-06-20  7:50 ` Andreas Mohr
  2006-06-21  9:20 ` Andi Kleen
  2006-06-21  9:26 ` Arjan van de Ven
  3 siblings, 0 replies; 7+ messages in thread
From: Andreas Mohr @ 2006-06-20  7:50 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel, Andrew Morton

Hi,

On Tue, Jun 20, 2006 at 12:55:25AM -0400, Chuck Ebbert wrote:
> Halt the CPU when serious errors are encountered and we
> deliberately go into an infinite loop.
> 
> Suggested by Andreas Mohr.

Thanks for the very fast patch, but:

>       /* Assume hlt works */
> -     halt();
> -     for(;;);
> +     for (;;)
> +             halt();

The comment above seems to hint at this code in arch/i386/kernel/smp.c/
stop_this_cpu() which does proper hlt checks:

        if (cpu_data[smp_processor_id()].hlt_works_ok)
                for(;;) halt();
        for (;;);

So I'm unsure what happens if hlt is not supported properly. Maybe
there's an invalid opcode exception in a loop then.

I think a patch should do the following (or more):

- try to group various CPU emergency stop places together
- comment about trying to avoid overheating, mentioning ACPI
  thermal protection specifications and possibly *missing protection*
  in non-ACPI systems/modes (APM!)
- if (hlt_works_ok), do hlt loop
- if not, do a cpu_relax() loop (or even: for (;;) 3 times cpu_relax()
  for lower activity?)

One thing that may be worth pondering about is whether using cpu_relax()
might actually keep the CPU slightly below the ACPI shutdown temperature
limit and thus do *more* harm with a broken fan. In that case we might
want to check for active ACPI mode and if active do a busy loop instead.
This however may cause temperature to cycle (will a CPU shutdown reactivate
itself after cooling down again?), so a cpu_relax() might still be better.

Andreas Mohr

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] i386: halt the CPU on serious errors
@ 2006-06-21  9:17 Chuck Ebbert
  0 siblings, 0 replies; 7+ messages in thread
From: Chuck Ebbert @ 2006-06-21  9:17 UTC (permalink / raw)
  To: Dave Jones; +Cc: Andrew Morton, Andreas Mohr, linux-kernel

In-Reply-To: <20060620050910.GA6091@redhat.com>

On Tue, 20 Jun 2006 01:09:10 -0400, Dave Jones wrote:

> > --- 2.6.17-32.orig/arch/i386/kernel/crash.c
> > +++ 2.6.17-32/arch/i386/kernel/crash.c
> > @@ -113,8 +113,8 @@ static int crash_nmi_callback(struct pt_
> >     disable_local_APIC();
> >     atomic_dec(&waiting_for_crash_ipi);
> >     /* Assume hlt works */
> > -   halt();
> > -   for(;;);
> > +   for (;;)
> > +           halt();
> >  
> >     return 1;
>
> But we should never get past that first halt(), as interrupts are disabled.

Yeah, but why not do it anyway?  I'll change the comment.

> > --- 2.6.17-32.orig/arch/i386/kernel/doublefault.c
> > +++ 2.6.17-32/arch/i386/kernel/doublefault.c
> > @@ -44,7 +44,8 @@ static void doublefault_fn(void)
> >             }
> >     }
> >  
> > -   for (;;) /* nothing */;
> > +   for (;;)
> > +           halt();
> >  }
>
> This one would probably be better off as a cpu_relax()

OK.

-- 
Chuck
 "You can't read a newspaper if you can't read."  --George W. Bush

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] i386: halt the CPU on serious errors
@ 2006-06-21  9:17 Chuck Ebbert
  0 siblings, 0 replies; 7+ messages in thread
From: Chuck Ebbert @ 2006-06-21  9:17 UTC (permalink / raw)
  To: Andreas Mohr; +Cc: Andrew Morton, linux-kernel

In-Reply-To: <20060620075015.GB3030@rhlx01.fht-esslingen.de>

On Tue, 20 Jun 2006 09:50:15 +0200, Andreas Mohr wrote:

> Thanks for the very fast patch, but:
> 
> >       /* Assume hlt works */
> > -     halt();
> > -     for(;;);
> > +     for (;;)
> > +             halt();
> 
> The comment above seems to hint at this code in arch/i386/kernel/smp.c/
> stop_this_cpu() which does proper hlt checks:
> 
>         if (cpu_data[smp_processor_id()].hlt_works_ok)
>                 for(;;) halt();
>         for (;;);
> 
> So I'm unsure what happens if hlt is not supported properly. Maybe
> there's an invalid opcode exception in a loop then.

  Some ancient CPUs never wake up on interrupt after a halt.  See
include/asm-i386/bugs.h::check_hlt().

-- 
Chuck
 "You can't read a newspaper if you can't read."  --George W. Bush

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] i386: halt the CPU on serious errors
  2006-06-20  4:55 [patch] i386: halt the CPU on serious errors Chuck Ebbert
  2006-06-20  5:09 ` Dave Jones
  2006-06-20  7:50 ` Andreas Mohr
@ 2006-06-21  9:20 ` Andi Kleen
  2006-06-21  9:26 ` Arjan van de Ven
  3 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2006-06-21  9:20 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: Andreas Mohr, Andrew Morton, linux-kernel

Chuck Ebbert <76306.1226@compuserve.com> writes:

> Halt the CPU when serious errors are encountered and we
> deliberately go into an infinite loop.

You need to check first if the CPU supports halt. 

-Andi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] i386: halt the CPU on serious errors
  2006-06-20  4:55 [patch] i386: halt the CPU on serious errors Chuck Ebbert
                   ` (2 preceding siblings ...)
  2006-06-21  9:20 ` Andi Kleen
@ 2006-06-21  9:26 ` Arjan van de Ven
  3 siblings, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2006-06-21  9:26 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel, Andreas Mohr, Andrew Morton

On Tue, 2006-06-20 at 00:55 -0400, Chuck Ebbert wrote:
> Halt the CPU when serious errors are encountered and we
> deliberately go into an infinite loop.
> 
> Suggested by Andreas Mohr.
> 
> Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
> 
> --- 2.6.17-32.orig/arch/i386/kernel/crash.c
> +++ 2.6.17-32/arch/i386/kernel/crash.c
> @@ -113,8 +113,8 @@ static int crash_nmi_callback(struct pt_
>  	disable_local_APIC();
>  	atomic_dec(&waiting_for_crash_ipi);
>  	/* Assume hlt works */
> -	halt();
> -	for(;;);
> +	for (;;)
> +		halt();

if halt is fall through (as you suggest).. this would want a cpu_relax()
as well..


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2006-06-21  9:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-20  4:55 [patch] i386: halt the CPU on serious errors Chuck Ebbert
2006-06-20  5:09 ` Dave Jones
2006-06-20  7:50 ` Andreas Mohr
2006-06-21  9:20 ` Andi Kleen
2006-06-21  9:26 ` Arjan van de Ven
  -- strict thread matches above, loose matches on Subject: below --
2006-06-21  9:17 Chuck Ebbert
2006-06-21  9:17 Chuck Ebbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.