public inbox for linux-ia64@vger.kernel.org
* [Linux-ia64] [RFC] Remove MCA dump?
@ 2002-08-21 11:51 Matthew Wilcox
  2002-08-21 12:34 ` Andreas Schwab
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Matthew Wilcox @ 2002-08-21 11:51 UTC (permalink / raw)
  To: linux-ia64

The MCA handler is completely useless.  If I crash the machine and
forget to clear the MCA dump in firmware, at the next boot Linux dumps
the registers (in a hard-to-understand style) and hangs.  In its current
state, I'd rather it simply weren't in the kernel at all.

Do any platforms not have MCA handling facilities in firmware?

-- 
Revolutions do not require corporate support.



* Re: [Linux-ia64] [RFC] Remove MCA dump?
  2002-08-21 11:51 [Linux-ia64] [RFC] Remove MCA dump? Matthew Wilcox
@ 2002-08-21 12:34 ` Andreas Schwab
  2002-08-21 12:38 ` Matthew Wilcox
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Andreas Schwab @ 2002-08-21 12:34 UTC (permalink / raw)
  To: linux-ia64

Matthew Wilcox <willy@debian.org> writes:

|> The MCA handler is completely useless.  If I crash the machine and
|> forget to clear the MCA dump in firmware, at the next boot Linux dumps
|> the registers (in a hard-to-understand style) and hangs.

I haven't seen such hangs on our machines, except for some old
pre-production systems that lack the necessary hardware support.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



* Re: [Linux-ia64] [RFC] Remove MCA dump?
  2002-08-21 11:51 [Linux-ia64] [RFC] Remove MCA dump? Matthew Wilcox
  2002-08-21 12:34 ` Andreas Schwab
@ 2002-08-21 12:38 ` Matthew Wilcox
  2002-08-21 13:04 ` Erich Focht
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Matthew Wilcox @ 2002-08-21 12:38 UTC (permalink / raw)
  To: linux-ia64

On Wed, Aug 21, 2002 at 02:34:42PM +0200, Andreas Schwab wrote:
> Matthew Wilcox <willy@debian.org> writes:
> 
> |> The MCA handler is completely useless.  If I crash the machine and
> |> forget to clear the MCA dump in firmware, at the next boot Linux dumps
> |> the registers (in a hard-to-understand style) and hangs.
> 
> I haven't seen such hangs on our machines, except for some old
> pre-production systems that lack the necessary hardware support.

I wonder why not.  Here's the code:

void
init_handler_platform (struct pt_regs *regs)
{
        /* if a kernel debugger is available call it here,
           else just dump the registers */

        show_regs(regs);                /* dump the state info */
        while (1);                      /* hang city if no debugger */
}

Maybe it only locks up that specific processor so you didn't notice you'd
lost a processor on an SMP system?

-- 
Revolutions do not require corporate support.



* Re: [Linux-ia64] [RFC] Remove MCA dump?
  2002-08-21 11:51 [Linux-ia64] [RFC] Remove MCA dump? Matthew Wilcox
  2002-08-21 12:34 ` Andreas Schwab
  2002-08-21 12:38 ` Matthew Wilcox
@ 2002-08-21 13:04 ` Erich Focht
  2002-08-21 14:59 ` Hall, Jenna S
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Erich Focht @ 2002-08-21 13:04 UTC (permalink / raw)
  To: linux-ia64

On Wednesday 21 August 2002 13:51, Matthew Wilcox wrote:
> The MCA handler is completely useless.  If I crash the machine and
> forget to clear the MCA dump in firmware, at the next boot Linux dumps
> the registers (in a hard-to-understand style) and hangs.  In its current
> state, I'd rather it simply weren't in the kernel at all.

I'd also like to object to the statement above. The dumped registers are
fine and enough to get an idea where the system was and what it was doing.
My machines don't hang after the reboot. I'm using kdb and LKCD, so
init_handler_platform looks completely different from yours. In any case,
I don't understand why that code should be executed _after_ you reboot.
Shouldn't the MCA logs come from ia64_log_print?

Regards,
Erich




* RE: [Linux-ia64] [RFC] Remove MCA dump?
  2002-08-21 11:51 [Linux-ia64] [RFC] Remove MCA dump? Matthew Wilcox
                   ` (2 preceding siblings ...)
  2002-08-21 13:04 ` Erich Focht
@ 2002-08-21 14:59 ` Hall, Jenna S
  2002-08-21 16:14 ` David Mosberger
  2002-08-21 18:30 ` David Mosberger
  5 siblings, 0 replies; 7+ messages in thread
From: Hall, Jenna S @ 2002-08-21 14:59 UTC (permalink / raw)
  To: linux-ia64

Yes, they should.  There is no reason your system should hang *after* the
reboot if the MCA occurred before the reboot.  I also wonder if you've got
some old hardware or firmware...please let me know what HW/FW you're running
and I'll try to figure out why your system behaves this way upon reboot.

The MCA code is certainly not finished.  The logging is there, but the
recovery for MCAs is still in development by Intel and Bull engineers.  You
can disable the MCA logging in the .config but if your HW/FW is behaving
correctly it should not matter...besides, the logs do provide valuable
information in the case of a hardware failure (e.g. a flaky memory DIMM causing
sporadic hardware-corrected MCAs).

Further, the init_handler_platform() procedure is only called *during* an
INIT event - which is fatal and wouldn't be helped even if the MCA
recovery code were perfectly healthy.  The expected behavior of this
procedure is as you described - hang if no KDB enabled, jump into KDB if it
is enabled.  There is definitely something else going on with your system
beyond a simple MCA that occurred before the reboot.
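
To make the expected INIT behavior concrete, here is a minimal user-space
sketch. It is not the kernel code: struct fake_regs, dump_regs(), and
init_event_action() are hypothetical names invented for illustration, and
the "hang" is modeled as a return value rather than an actual endless loop.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these names come from the kernel source. */
enum init_action { INIT_ENTER_DEBUGGER, INIT_HANG };

struct fake_regs {
    unsigned long iip;   /* interrupted instruction pointer */
};

/* Stand-in for show_regs(): print whatever state we have. */
static void dump_regs(const struct fake_regs *regs)
{
    printf("iip=0x%lx\n", regs->iip);
}

/*
 * Model of the decision made on an INIT event: always dump the
 * registers, then either hand over to a debugger or hang.
 */
enum init_action init_event_action(const struct fake_regs *regs,
                                   bool debugger_enabled)
{
    dump_regs(regs);
    return debugger_enabled ? INIT_ENTER_DEBUGGER : INIT_HANG;
}
```

Calling init_event_action(&regs, false) models the stock kernel quoted
earlier in the thread (dump, then hang); passing true models a kernel
built with a debugger such as KDB.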

Jenna






* Re: [Linux-ia64] [RFC] Remove MCA dump?
  2002-08-21 11:51 [Linux-ia64] [RFC] Remove MCA dump? Matthew Wilcox
                   ` (3 preceding siblings ...)
  2002-08-21 14:59 ` Hall, Jenna S
@ 2002-08-21 16:14 ` David Mosberger
  2002-08-21 18:30 ` David Mosberger
  5 siblings, 0 replies; 7+ messages in thread
From: David Mosberger @ 2002-08-21 16:14 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 21 Aug 2002 12:51:40 +0100, Matthew Wilcox <willy@debian.org> said:

  Matthew> The MCA handler is completely useless.

It's useful to me.

  Matthew> If I crash the machine and forget to clear the MCA dump in
  Matthew> firmware, at the next boot Linux dumps the registers (in a
  Matthew> hard-to-understand style) and hangs.

It shouldn't hang (and it doesn't on the Itanium & Itanium 2 systems
we have).  Sounds like something is off...

I agree that the output format is hard to understand, but a user-level
tool can fix that.  In my experience, the most useful value to look at
is CR[19], since it contains the IIP.  (It would be nice, though, to
have a beginner's guide to decoding MCAs until there are better tools.)
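
As a starting point for such a user-level tool, the sketch below maps the
interruption control register numbers to their architectural names, so
that CR[19] can be labeled as IIP in decoded output. The table follows
the Itanium architecture's register numbering; the struct cr_entry type
and the cr_name() helper are hypothetical and do not come from any
existing tool.

```c
#include <stddef.h>

/* One row of the lookup table: control register number and its name. */
struct cr_entry {
    int index;
    const char *name;
};

/* Interruption control registers, CR16 through CR25 (CR18 is reserved). */
static const struct cr_entry cr_table[] = {
    { 16, "IPSR" },  /* interruption processor status register */
    { 17, "ISR"  },  /* interruption status register */
    { 19, "IIP"  },  /* interruption instruction pointer */
    { 20, "IFA"  },  /* interruption faulting address */
    { 21, "ITIR" },  /* interruption TLB insertion register */
    { 22, "IIPA" },  /* interruption instruction previous address */
    { 23, "IFS"  },  /* interruption function state */
    { 24, "IIM"  },  /* interruption immediate */
    { 25, "IHA"  },  /* interruption hash address */
};

/* Return the name for a control register number, or NULL if not listed. */
const char *cr_name(int index)
{
    size_t i;

    for (i = 0; i < sizeof(cr_table) / sizeof(cr_table[0]); i++) {
        if (cr_table[i].index == index)
            return cr_table[i].name;
    }
    return NULL;
}
```

A decoder could call cr_name() on each register number it finds in the
dump and print the name next to the raw value, falling back to the bare
number when the function returns NULL.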

	--david



* Re: [Linux-ia64] [RFC] Remove MCA dump?
  2002-08-21 11:51 [Linux-ia64] [RFC] Remove MCA dump? Matthew Wilcox
                   ` (4 preceding siblings ...)
  2002-08-21 16:14 ` David Mosberger
@ 2002-08-21 18:30 ` David Mosberger
  5 siblings, 0 replies; 7+ messages in thread
From: David Mosberger @ 2002-08-21 18:30 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 21 Aug 2002 13:38:32 +0100, Matthew Wilcox <willy@debian.org> said:

  Matthew> On Wed, Aug 21, 2002 at 02:34:42PM +0200, Andreas Schwab
  Matthew> wrote:
  >> Matthew Wilcox <willy@debian.org> writes:
  >> 
  >> |> The MCA handler is completely useless.  If I crash the machine and
  >> |> forget to clear the MCA dump in firmware, at the next boot Linux dumps
  >> |> the registers (in a hard-to-understand style) and hangs.
  >> 
  >> I haven't seen such hangs on our machines, except for some old
  >> pre-production systems that lack the necessary hardware support.

  Matthew> I wonder why not.  Here's the code:

  Matthew> void
  Matthew> init_handler_platform (struct pt_regs *regs)
  Matthew> {
  Matthew>         /* if a kernel debugger is available call it here,
  Matthew>            else just dump the registers */
  Matthew>
  Matthew>         show_regs(regs);        /* dump the state info */
  Matthew>         while (1);              /* hang city if no debugger */
  Matthew> }

  Matthew> Maybe it only locks up that specific processor so you
  Matthew> didn't notice you'd lost a processor on an SMP system?

This endless loop happens only in response to an INIT event.  The
boot-time MCA dump does not go through this path.

	--david


