public inbox for linux-ia64@vger.kernel.org
From: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
To: linux-ia64@vger.kernel.org
Subject: Re: [PATCH] New way of storing MCA/INIT logs
Date: Wed, 05 Mar 2008 13:14:52 +0000
Message-ID: <47CE9CCC.1040405@bull.net>
In-Reply-To: <47CD8142.7050207@bull.net>

Thank you for your remarks.

>>The MCAs/INITs are rare.
> 
> One hopes.  :-)

Should you have a single unrecoverable MCA, the game is over.
Neither the original code nor mine can log it before the machine
is rebooted or halted.

Only the recovered ones matter here.
It is safe to continue after a recovered event, but you need
its log to be alerted and to schedule maintenance.

Both the original code and mine can "swallow" about 1 recovered
event / minute, and tolerate a "burst" of 2 or IA64_MAX_MCA_INIT_BUFS
events, respectively.

The probability of having more than that many _independent_ events
within a short time frame is very low. Therefore you can
afford to lose some events of the same "burst".

>>There is no use wasting much permanent resources.
> 
> Sometimes a necessary evil.  Normal memory allocation routines 
> cannot be called from MCA/INIT context.

This is why I pre-allocate IA64_MAX_MCA_INIT_BUFS buffers.
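
To illustrate the idea (this is not the code from the patch), here is a
minimal user-space sketch of a statically reserved buffer pool that
stores records until it is full and then drops the rest of the burst.
MAX_BUFS stands in for IA64_MAX_MCA_INIT_BUFS with an assumed value;
BUF_SIZE and all function names are hypothetical. It ignores
concurrency between CPUs on purpose.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_BUFS 4     /* stand-in for IA64_MAX_MCA_INIT_BUFS; value assumed */
#define BUF_SIZE 256   /* assumed record size, for illustration only */

struct log_buf {
    bool          in_use;
    size_t        len;
    unsigned char data[BUF_SIZE];
};

/* Static storage: reserved up front, so the handler never allocates. */
static struct log_buf pool[MAX_BUFS];
static unsigned long dropped;   /* events lost because the pool was full */

/* Store one record; returns false if the burst exceeded the pool. */
static bool log_store(const void *rec, size_t len)
{
    if (len > BUF_SIZE)
        len = BUF_SIZE;         /* truncate rather than overflow */
    for (size_t i = 0; i < MAX_BUFS; i++) {
        if (!pool[i].in_use) {
            memcpy(pool[i].data, rec, len);
            pool[i].len = len;
            pool[i].in_use = true;
            return true;
        }
    }
    dropped++;                  /* same burst, pool exhausted: give up */
    return false;
}

/* Called later from normal context, once the records have been saved. */
static void log_release_all(void)
{
    for (size_t i = 0; i < MAX_BUFS; i++)
        pool[i].in_use = false;
}
```

The only cost is MAX_BUFS * sizeof(struct log_buf) of permanent memory,
which is the trade-off being discussed here.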

> Even if the system is going down it is still nice to try to 
> go down gracefully.  Taking a system dump and logging as 
> much as possible is useful, too.

You (may want to) take a dump if the event is not recovered.
In such a case, neither the original code nor mine does anything
useful :-)

> In the case where all the CPUs are INITed, what happens?
> Does this assume only one CPU at a time processes/logs records?

I have not added my code to the INIT handler yet.

From the SAL spec.: INIT reason code:

0 = Received INIT signal on this processor for reasons other than machine
     check rendezvous and CrashDump switch assertion.
1 = Received INIT signal on this processor during machine check rendezvous.
2 = Received INIT signal on this processor due to CrashDump switch assertion.

I see no point in logging anything in the MCA rendezvous and
CrashDump cases (the latter can actually take a dump or call KDB).
I intend to log only the "other reasons", and only on the monarch.
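
The decision described above could be sketched like this. The numeric
values are the reason codes quoted from the SAL spec.; the enum and
function names are made up for illustration and are not taken from the
kernel or from my patch.

```c
#include <stdbool.h>

/* INIT reason codes as quoted from the SAL spec.; names are hypothetical. */
enum init_reason {
    INIT_REASON_OTHER      = 0,  /* neither MCA rendezvous nor CrashDump */
    INIT_REASON_MCA_RENDEZ = 1,  /* INIT during machine check rendezvous */
    INIT_REASON_CRASHDUMP  = 2,  /* CrashDump switch assertion */
};

/* Log only "other reasons", and only on the monarch CPU. */
static bool init_should_log(int reason, bool is_monarch)
{
    return is_monarch && reason == INIT_REASON_OTHER;
}
```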

>>The code does not assume that the rendezvous always works.
> 
> Could you explain.  Do you mean MCA/INIT rendezvous?

Yes.
If everything goes fine, only one CPU, the monarch, logs.
(See also the comment in the INIT handler saying:
 FIXME: Workaround for broken proms that drive all INIT events as monarchs.)

However, the SAL spec. allows in "OS_MCA Hand-off State" that
"Rendezvous of other processors was required but was unsuccessful
on one or more processors."

E.g. two non-global MCAs can happen on two CPUs, and both CPUs can
start executing the MCA handler, each thinking that it is the monarch.
My code should survive...
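
One way to survive two would-be monarchs, shown here only as a
user-space sketch with C11 atomics and not as the code in the patch,
is to let each CPU claim a log slot with a compare-and-swap. Two
handlers running at the same time then can never end up writing to the
same buffer. MAX_BUFS (standing in for IA64_MAX_MCA_INIT_BUFS, value
assumed), slot_owner, claim_slot and release_slot are all hypothetical
names.

```c
#include <stdatomic.h>

#define MAX_BUFS 4   /* stand-in for IA64_MAX_MCA_INIT_BUFS; value assumed */

/* 0 means "free"; otherwise the owner's CPU number plus one. */
static atomic_int slot_owner[MAX_BUFS];

/* Returns the claimed slot index, or -1 if every slot is taken. */
static int claim_slot(int cpu)
{
    for (int i = 0; i < MAX_BUFS; i++) {
        int expected = 0;
        /* Succeeds for exactly one caller per free slot. */
        if (atomic_compare_exchange_strong(&slot_owner[i], &expected, cpu + 1))
            return i;
    }
    return -1;   /* burst larger than the pool: the event is dropped */
}

/* Called from normal context once the record has been consumed. */
static void release_slot(int slot)
{
    atomic_store(&slot_owner[slot], 0);
}
```

No lock is taken, so a CPU that dies inside the handler cannot block
the other one; at worst it leaks one slot.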

Thanks,

Zoltan
