public inbox for linux-kernel@vger.kernel.org
* 2.4. continues after Aieee...
@ 2000-11-15  7:53 Rogier Wolff
  2000-11-15 16:15 ` Dennis
  0 siblings, 1 reply; 7+ messages in thread
From: Rogier Wolff @ 2000-11-15  7:53 UTC
  To: linux-kernel


Shouldn't the system be "halted" after an "Aiee, killing interrupt
handler"?


Modem status change from 0x63 to 0xf3
Unable to handle kernel NULL pointer dereference at virtual address 00000629
 printing eip:
c4854fcc
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c4854fcc>]
EFLAGS: 00010002
eax: 00000620   ebx: c1e80000   ecx: c1f28000   edx: 00000000
esi: c2749800   edi: 000000f3   ebp: c3ba6000   esp: c26d7dc0
ds: 0018   es: 0018   ss: 0018
Process agetty (pid: 299, stackpage=c26d7000)
Stack: c487f3e3 c3ba6578 c487f3e2 00000212 00000145 00010082 c487f3e9 00000246 
       c3ba6578 c487f3e8 c4855603 c1e80000 00000002 c3ba6000 c487f3e2 c3ba6000 
       0000000b c3ba6000 c26d7eb4 c0274400 00000002 0002001d c4859d8f c1e80000 
Call Trace: [<c487f3e3>] [<c487f3e2>] [<c487f3e9>] [<c487f3e8>] [<c4855603>] [<c487f3e2>] [<c4859d8f>] 
       [<c484f358>] [<c484f471>] [<c010b681>] [<c010b7f2>] [<c010a4e0>] [<c0116ce9>] [<c0122a4d>] [<c01233ed>] 
       [<c0123683>] [<c01235bc>] [<c014c87c>] [<c012e032>] [<c010a423>] 
Code: f6 40 09 08 0f 85 22 01 00 00 8b 86 bc 00 00 00 a8 06 0f 84 
Aiee, killing interrupt handler
Scheduling in interrupt
kernel BUG at sched.c:692!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c0116019>]
EFLAGS: 00010292
eax: 0000001b   ebx: 00000000   ecx: c1f28000   edx: 00000000
esi: 00000000   edi: 0000000b   ebp: c26d7cb8   esp: c26d7c68
ds: 0018   es: 0018   ss: 0018
Process agetty (pid: 299, stackpage=c26d7000)
Stack: c01eb041 c01eb216 000002b4 c1172160 c26d6000 0000000b 00000282 c26d6000 
       00000020 00000086 00000000 c3fca000 c26d6000 c26d6000 c011a9cf c26d6000 
       c1172160 00000000 c26d6000 00000629 00000629 c011abca 00000000 00000000 
Call Trace: [<c01eb041>] [<c01eb216>] [<c011a9cf>] [<c011abca>] [<c0111a88>] [<c010a956>] [<c0111da6>] 
       [<c01ea15e>] [<c0111a88>] [<c010e586>] [<c010b681>] [<c01e16c1>] [<c01e16c1>] [<c0188b92>] [<c010a564>] 
       [<c4854fcc>] [<c487f3e3>] [<c487f3e2>] [<c487f3e9>] [<c487f3e8>] [<c4855603>] [<c487f3e2>] [<c4859d8f>] 
       [<c484f358>] [<c484f471>] [<c010b681>] [<c010b7f2>] [<c010a4e0>] [<c0116ce9>] [<c0122a4d>] [<c01233ed>] 
       [<c0123683>] [<c01235bc>] [<c014c87c>] [<c012e032>] [<c010a423>] 
Code: 0f 0b 90 8d 65 bc 5b 5e 5f 89 ec 5d c3 89 f6 55 89 e5 83 ec 
Aiee, killing interrupt handler
Scheduling in interrupt
kernel BUG at sched.c:692!
invalid operand: 0000


After this, the call trace becomes longer and longer, but the system
keeps on oopsing... 
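
As an aside on reading the first oops: the faulting address can be
cross-checked against the register dump. On i386 the first four bytes of
the Code: line, f6 40 09 08, decode as testb $0x8,0x9(%eax). With eax at
00000620, the access lands at 0x620 + 9 = 0x629, which is the address in the
"Unable to handle kernel NULL pointer dereference" line. A minimal Python
sketch of that arithmetic follows. The helper name is hypothetical, and it
handles only this one instruction form (opcode F6 /0 with an 8-bit
displacement and no SIB byte), not general x86 decoding.

```python
# Register order used by the 3-bit r/m field of an i386 ModRM byte.
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def effective_address_testb(code, regs):
    """Effective address of `testb $imm8, disp8(%reg)` (hypothetical helper).

    Handles only opcode F6 /0 with ModRM mod=01 and no SIB byte.
    """
    opcode, modrm, disp8 = code[0], code[1], code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0xF6 or mod != 1 or reg != 0 or rm == 4:
        raise ValueError("not the instruction form this sketch handles")
    # disp8 is sign-extended before being added to the base register.
    disp = disp8 - 256 if disp8 >= 0x80 else disp8
    return (regs[REG_NAMES[rm]] + disp) & 0xFFFFFFFF


# Values copied from the first oops in this message.
code = bytes.fromhex("f6400908")
regs = {"eax": 0x00000620, "esi": 0xC2749800}
fault_address = effective_address_testb(code, regs)
print(hex(fault_address))  # prints 0x629
```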

				Roger.




-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
*       Common sense is the collection of                                *
******  prejudices acquired by age eighteen.   -- Albert Einstein ********
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: 2.4. continues after Aieee...
  2000-11-15  7:53 2.4. continues after Aieee Rogier Wolff
@ 2000-11-15 16:15 ` Dennis
  2000-11-15 16:30   ` Rogier Wolff
  2000-11-16  4:51   ` David Feuer
  0 siblings, 2 replies; 7+ messages in thread
From: Dennis @ 2000-11-15 16:15 UTC
  To: Rogier Wolff, linux-kernel

At 02:53 AM 11/15/2000, Rogier Wolff wrote:

>Shouldn't the system be "halted" after an "Aiee, killing interrupt
>handler"?
>

This brings up another question. Has any work been done to force Linux
to reboot on all panics? Linux's propensity to crash drivers (say, the
network card driver) and leave the system running makes Linux unusable in
unattended environments, as the machine is functionally dead.

A simple switch that forces a reboot on panic would do much to alleviate the
problem.

DB


* Re: 2.4. continues after Aieee...
  2000-11-15 16:15 ` Dennis
@ 2000-11-15 16:30   ` Rogier Wolff
  2000-11-16 11:20     ` Russell King
  2000-11-16 15:34     ` Dennis
  2000-11-16  4:51   ` David Feuer
  1 sibling, 2 replies; 7+ messages in thread
From: Rogier Wolff @ 2000-11-15 16:30 UTC
  To: Dennis; +Cc: Rogier Wolff, linux-kernel

Dennis wrote:
> At 02:53 AM 11/15/2000, Rogier Wolff wrote:
> 
> >Shouldn't the system be "halted" after an "Aiee, killing interrupt
> >handler"?
> >
> 
> This brings another question. Has there been any work done to force linux 
> to reboot on all panics? Linux's propensity to crash drivers (say the 

You already have the option to say what happens on panic. 

> network card driver) and leave the system running make linux unusable in 
> unattended environments as the machine is functionally dead.

Which doesn't help in this case, as your network card COULD be dead,
while the system simply hasn't crashed....
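
The option referred to above is presumably the kernel's panic timeout,
which can be set with the panic=N boot parameter or at run time through
/proc/sys/kernel/panic (N seconds before an automatic reboot, 0 meaning no
reboot). A minimal Python sketch for reading and setting it, assuming the
sysctl file is present and the caller is root. The function names are
hypothetical.

```python
from pathlib import Path

# Sysctl file holding the panic timeout in seconds (0 = do not reboot).
PANIC_SYSCTL = Path("/proc/sys/kernel/panic")


def get_panic_timeout(path=PANIC_SYSCTL):
    """Return the number of seconds the kernel waits before rebooting on panic."""
    return int(Path(path).read_text().strip())


def set_panic_timeout(seconds, path=PANIC_SYSCTL):
    """Ask the kernel to reboot `seconds` after a panic (requires root)."""
    Path(path).write_text(f"{int(seconds)}\n")
```

This only takes effect when the kernel actually calls panic(). As the rest
of the thread points out, it does not cover an oops that leaves the system
limping along, or a hardware lockup.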

				Roger.



-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
*       Common sense is the collection of                                *
******  prejudices acquired by age eighteen.   -- Albert Einstein ********

* Re: 2.4. continues after Aieee...
  2000-11-15 16:15 ` Dennis
  2000-11-15 16:30   ` Rogier Wolff
@ 2000-11-16  4:51   ` David Feuer
  1 sibling, 0 replies; 7+ messages in thread
From: David Feuer @ 2000-11-16  4:51 UTC
  To: linux-kernel

At 05:30 PM 11/15/2000 +0100, Rogier Wolff wrote:

> > network card driver) and leave the system running make linux unusable in
> > unattended environments as the machine is functionally dead.
>
>Which doesn't help in this case, as your network card COULD be dead,
>while the system simply hasn't crashed....

Yeah, but it doesn't matter.  The system is no more useful running with a
dead network card than it is rebooting itself.  Just make sure that it doesn't
reboot itself more than N times in M hours, and you'll be fine...   The
network admin needs to be paged in any case. The network card COULD be
dead, in which case the administrator needs to replace it.  Otherwise, a
reboot could solve the problem.
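
The "no more than N times in M hours" rule could be sketched as a sliding
window over recorded reboot timestamps. This is only an illustration in
Python. The function name and the idea of keeping timestamps in a list
(in practice they would have to live somewhere that survives a reboot) are
assumptions, not anything the kernel provides.

```python
def may_reboot(reboot_times, now, max_reboots, window_hours):
    """Return True if another automatic reboot is allowed.

    reboot_times: timestamps (seconds) of earlier automatic reboots.
    now:          current time in the same units.
    Allows a reboot only if fewer than max_reboots happened in the
    last window_hours hours.
    """
    window_start = now - window_hours * 3600
    recent = [t for t in reboot_times if t > window_start]
    return len(recent) < max_reboots


# Three reboots within the last hour, limit of three: hold off and page someone.
print(may_reboot([0, 1000, 2000], now=3000, max_reboots=3, window_hours=1))
# Later on, only one of them is still inside the window: rebooting is allowed.
print(may_reboot([0, 1000, 2000], now=5000, max_reboots=3, window_hours=1))
```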

--
This message has been brought to you by the letter alpha and the number pi.
Open Source: Think locally, act globally.
David Feuer
David_Feuer@brown.edu


* Re: 2.4. continues after Aieee...
  2000-11-15 16:30   ` Rogier Wolff
@ 2000-11-16 11:20     ` Russell King
  2000-11-16 15:34     ` Dennis
  1 sibling, 0 replies; 7+ messages in thread
From: Russell King @ 2000-11-16 11:20 UTC
  To: Rogier Wolff; +Cc: Dennis, linux-kernel

Rogier Wolff wrote:
> Dennis wrote:
> > network card driver) and leave the system running make linux unusable in
> > unattended environments as the machine is functionally dead.
> 
> Which doesn't help in this case, as your network card COULD be dead,
> while the system simply hasn't crashed....

Not every case causes a panic either.  This week, I had an instance of
an i686 box lock solid with a DFE-530TX net card.  Rebooting/power
cycling it didn't recover it (despite it working for the past month
without any problems).  It only started working again after I moved
it into a different PCI slot.

I've seen a couple of instances now on totally different hardware where
it is possible to lock a PCI bus solid by improper connections on some
of the PCI bus lines, so a faulty PCI socket seems to be the most likely
cause.

In this case, a "panic" doesn't help you; the machine experiences a
hardware lockup.  To catch these, you'd need a hardware watchdog.

What I'm basically saying is that there is only a limited amount that
Linux (or any OS) can do against these types of hardware failure.  If
you need better protection, try a hardware watchdog with user-space policy
implementations.
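
A hardware watchdog driven from user space might look roughly like this
under Linux's watchdog device convention: a daemon opens /dev/watchdog and
writes to it periodically, and if the writes stop, the hardware timer
expires and resets the machine. A minimal Python sketch, assuming a
watchdog driver is loaded and that it honours the "magic close" character
'V' for a deliberate shutdown. is_healthy is a hypothetical callback
standing in for whatever policy the site wants (ping the gateway, check a
daemon, and so on).

```python
import time


def feed_watchdog(device_path, is_healthy, interval_s, max_feeds=None,
                  sleep=time.sleep):
    """Feed the watchdog for as long as is_healthy() returns True.

    Returns True after a clean stop (max_feeds reached), or False if the
    health check failed and feeding was abandoned.
    """
    feeds = 0
    # Unbuffered, so every write reaches the driver immediately.
    with open(device_path, "wb", buffering=0) as wd:
        while max_feeds is None or feeds < max_feeds:
            if not is_healthy():
                # Stop feeding and skip the magic close, so that the
                # hardware timer is left to expire (driver-dependent).
                return False
            wd.write(b"\0")  # any write counts as a keepalive
            feeds += 1
            sleep(interval_s)
        # Deliberate shutdown: ask the driver to disarm the timer.
        wd.write(b"V")
    return True
```

Whether closing the device without 'V' really leaves the timer running
depends on the driver and kernel configuration, so the failure path here is
an assumption to verify on the target hardware.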
   _____
  |_____| ------------------------------------------------- ---+---+-
  |   |         Russell King        rmk@arm.linux.org.uk      --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+                                                     --- -+-
  /   |               THE developer of ARM Linux              |+| /|\
 /  | | |                                                     ---  |
    +-+-+ -------------------------------------------------  /\\\  |

* Re: 2.4. continues after Aieee...
  2000-11-15 16:30   ` Rogier Wolff
  2000-11-16 11:20     ` Russell King
@ 2000-11-16 15:34     ` Dennis
  2000-11-16 16:11       ` Russell King
  1 sibling, 1 reply; 7+ messages in thread
From: Dennis @ 2000-11-16 15:34 UTC
  To: Russell King, Rogier Wolff; +Cc: linux-kernel


>
>Not every case causes a panic either.  This week, I had an instance of
>an i686 box lock solid with a DFE-530TX net card.  Rebooting/power
>cycling it didn't recover it (despite it working for the past month
>without any problems).  It only started working again after I moved
>it into a different PCI slot.
>
>I've seen a couple of instances now on totally different hardware where
>it is possible to lock a PCI bus solid by improper connections on some
>of the PCI bus lines, so a faulty PCI socket seem to be the most likely
>cause.


There's nothing that software can do about a PCI bus lockup. You need a
hardware watchdog to reboot the system for this type of failure.

PCI has a very tight spec, and running a card (say on an extender) or with
another card that has too many loads can cause a bus failure. If you have
more than 4 cards on the bus you are out of spec, for example.

But that doesn't change the panic issue. If you have hardware problems you
can't expect any OS to help you; you need new hardware.

Dennis



* Re: 2.4. continues after Aieee...
  2000-11-16 15:34     ` Dennis
@ 2000-11-16 16:11       ` Russell King
  0 siblings, 0 replies; 7+ messages in thread
From: Russell King @ 2000-11-16 16:11 UTC
  To: Dennis; +Cc: Rogier Wolff, linux-kernel

Dennis writes:
> >Not every case causes a panic either.  This week, I had an instance of
> >an i686 box lock solid with a DFE-530TX net card.  Rebooting/power
> >cycling it didn't recover it (despite it working for the past month
> >without any problems).  It only started working again after I moved
> >it into a different PCI slot.
> >
> >I've seen a couple of instances now on totally different hardware where
> >it is possible to lock a PCI bus solid by improper connections on some
> >of the PCI bus lines, so a faulty PCI socket seem to be the most likely
> >cause.
> 
> 
> theres nothing that software can do with a pci bus lockup. You need a 
> hardware watchdog to reboot the system for this type of failure.

If you read on, you'll discover I did in fact say this.
   _____
  |_____| ------------------------------------------------- ---+---+-
  |   |         Russell King        rmk@arm.linux.org.uk      --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+                                                     --- -+-
  /   |               THE developer of ARM Linux              |+| /|\
 /  | | |                                                     ---  |
    +-+-+ -------------------------------------------------  /\\\  |
