From: Zoltan Menyhart <Zoltan.Menyhart_AT_bull.net@nospam.org>
To: linux-ia64@vger.kernel.org
Subject: Yet another MCA handler
Date: Wed, 14 Jan 2004 10:42:35 +0000
Message-ID: <40051D1B.7912DB91@nospam.org>
This is the season of the MCA handlers :-)
Let me show you the one that Christian Cotte-Barrot and I wrote...
I'd like to take this opportunity to express our special thanks to Jenna
Hall, who gave us the initial version of the ".S" code and much help,
and also to Mani Ayyar, David Song and Tony Luck for the technical
consultations.
Our handler currently deals with the translation register errors only.
I intended to write the recovery code for poisoned memory, too,
but I have no way to provoke this kind of error
( I do not really know what it looks like :-) )
The key features of our MCA handler are:
* Everything is CPU local ( an MCA data area is allocated and hooked
to each "cpuinfo" structure )
* No locks
* No rendezvous
- A rendezvous does not seem to work unless all the CPUs are started up,
i.e. when you specify a "maxcpus=<NUM>"...
- A failed rendezvous is a bad omen to start with
- The correctable / recoverable MCAs are CPU-local business
- All the CPUs can handle MCAs simultaneously
* The translation registers are purged / reloaded unconditionally:
cheaper than calling SAL_GET_STATE_INFO(MCA)
* Table driven TR purging / reload (except for the kernel stack mapping)
* TRs are all purged before the reloading starts ( an erroneous TR can still
be in conflict with a freshly purged / reloaded one )
* SAL_CLEAR_STATE_INFO(MCA) is called only for MCAs which have been
corrected (TR errors). For the others, the recovery will be attempted by
a fake page fault handler, by the device drivers and by the MCA daemon,
therefore the SAL MCA log is not cleared here -- future extension :-)
* "Silent" MCA handler: no prints by default ( unless debugging )
- Output uses locks...
* A bit more serious error / status checking
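
To illustrate the "purge everything first, then reload from a table" idea
on CPU-local data, here is a minimal user-space model in C. It is only a
sketch and not the code from the patch: every name in it (mca_area,
tr_table, model_purge_tr, model_insert_tr, mca_recover_trs), the table
layout and the address / PTE values are made up for the example. The real
handler would purge and insert TRs with the ia64 instructions from the
".S" code, and the kernel stack mapping, which is not table driven, is
left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4  /* hypothetical CPU count for the model */
#define MODEL_NR_TRS  8  /* hypothetical number of TR slots per CPU */

/* One pinned translation the handler can rebuild (made-up layout). */
struct tr_desc {
    unsigned slot;   /* TR slot index */
    uint64_t vaddr;  /* virtual address covered by the mapping */
    uint64_t pte;    /* translation to reinstall */
};

/* Model of one hardware TR slot. */
struct tr_slot {
    int      valid;
    uint64_t vaddr;
    uint64_t pte;
};

/* CPU-local MCA data area: touched only by its owning CPU, so no locks. */
struct mca_cpu_area {
    struct tr_slot tr[MODEL_NR_TRS];
    unsigned purged;    /* slots purged by the last recovery */
    unsigned reloaded;  /* slots reloaded by the last recovery */
};

static struct mca_cpu_area mca_area[MODEL_NR_CPUS];

/* Placeholder values, not real kernel addresses or PTEs. */
static const struct tr_desc tr_table[] = {
    { 0, 0x1000, 0xA1 },
    { 1, 0x2000, 0xB2 },
    { 2, 0x3000, 0xC3 },
};
#define TR_TABLE_LEN (sizeof(tr_table) / sizeof(tr_table[0]))

/* Stand-in for purging one TR; only the model state is changed. */
static void model_purge_tr(struct mca_cpu_area *a, const struct tr_desc *d)
{
    assert(d->slot < MODEL_NR_TRS);
    a->tr[d->slot].valid = 0;
    a->purged++;
}

/* Stand-in for inserting one TR from its table entry. */
static void model_insert_tr(struct mca_cpu_area *a, const struct tr_desc *d)
{
    assert(d->slot < MODEL_NR_TRS);
    a->tr[d->slot].vaddr = d->vaddr;
    a->tr[d->slot].pte   = d->pte;
    a->tr[d->slot].valid = 1;
    a->reloaded++;
}

/* Unconditional recovery for one CPU: no error log is consulted. */
static void mca_recover_trs(unsigned cpu)
{
    struct mca_cpu_area *a;
    size_t i;

    assert(cpu < MODEL_NR_CPUS);
    a = &mca_area[cpu];
    a->purged = a->reloaded = 0;

    /* Pass 1: purge every table-driven TR before touching any of them. */
    for (i = 0; i < TR_TABLE_LEN; i++)
        model_purge_tr(a, &tr_table[i]);

    /* Pass 2: reload them all from the table. */
    for (i = 0; i < TR_TABLE_LEN; i++)
        model_insert_tr(a, &tr_table[i]);
}
```

The two separate passes are the point of the sketch: a slot is reloaded
only after all slots have been purged, so a stale erroneous entry cannot
overlap a fresh one. Since mca_recover_trs() works only on the area
of the CPU it is called for, several CPUs could run it at the same time
without any lock.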
This patch is against the version 2.6.1 + kdb-v4.3-2.6.1-common-b0.bz2 +
kdb-v4.3-2.6.1-ia64-b0.bz2.
Testing:
- Obviously by use of an ITP
- In my next mail I'll include a patch that can insert an illegal
translation in a TR provoking an MCA
Problems:
Neither "IA64_LOG_NEXT_BUFFER()" nor "salinfo_log_wakeup()" works :-(
I think some addresses are messed up. The system says it cannot
translate virtual address...
I'll send the patch in the next letter.
Should the list refuse it due to its length, please pick it up at our
anonymous FTP server: ftp://visibull.frec.bull.fr/pub/linux/mca/
Your remarks will be appreciated.
Zoltan Menyhart