From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Linux Kernel list <linux-kernel@vger.kernel.org>,
linux-ia64@vger.kernel.org
Cc: Linas Vepstas <linas@austin.ibm.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
long <tlnguyen@snoqualmie.dp.intel.com>,
linux-pci@atrey.karlin.mff.cuni.cz,
linuxppc64-dev <linuxppc64-dev@ozlabs.org>
Subject: [PATCH 08/10] IOCHK interface for I/O error handling/detecting
Date: Thu, 09 Jun 2005 22:00:44 +0900
Message-ID: <42A83D7C.2010101@jp.fujitsu.com>
In-Reply-To: <42A8386F.2060100@jp.fujitsu.com>
[This is 8 of 10 patches, "iochk-08-mcadrv.patch"]
- Touching poisoned data raises an MCA. Currently this is treated
as a fatal error and brings the system down immediately. But since
the MCA reports the physical address where the error happened,
we can take some action to survive.
If the address lies within a resource of a "check-in" device,
it is guaranteed that its driver will call iochk_read in
the very near future, and that the driver now has both the
ability and the responsibility to recover from the error.
So if the address is a "check-in" address, all the OS has to do
is mark the "check-in" devices and resume normal operation. The
driver will soon notice the error and handle it properly.
Note:
We can identify the affected device, but because of the SAL
behavior (described in patch 6 of 10), we need to mark all
"check-in" devices. To be fixed in the future, if possible.
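For illustration only, the address test described above can be modeled
in user space roughly as follows. This is a minimal sketch, not the
patch code: struct mock_resource, struct mock_dev, NUM_BARS, the BAR_*
macros and addr_in_checked_devices() are hypothetical stand-ins for
struct resource, struct pci_dev, PCI_ROM_RESOURCE, the
PCI_BASE_ADDRESS_* flags and offending_addr_in_check(). It assumes the
low bits of the resource flags carry the BAR type bits, as the patch
does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel definitions (assumptions). */
#define NUM_BARS          6     /* models PCI_ROM_RESOURCE */
#define BAR_SPACE_IO      0x01  /* models PCI_BASE_ADDRESS_SPACE */
#define BAR_MEM_TYPE_MASK 0x06  /* models PCI_BASE_ADDRESS_MEM_TYPE_MASK */
#define BAR_MEM_TYPE_64   0x04  /* models PCI_BASE_ADDRESS_MEM_TYPE_64 */

struct mock_resource {
	uint64_t start;
	uint64_t end;           /* inclusive, like struct resource */
	unsigned long flags;
};

struct mock_dev {
	struct mock_resource resource[NUM_BARS];
};

/* A 64-bit memory BAR occupies two consecutive BAR slots. */
static int is_64bit_mem_bar(const struct mock_resource *r)
{
	return (r->flags & (BAR_SPACE_IO | BAR_MEM_TYPE_MASK))
		== BAR_MEM_TYPE_64;
}

/* Return 1 if addr falls inside any BAR of any "check-in" device. */
static int addr_in_checked_devices(const struct mock_dev *devs,
				   size_t ndevs, uint64_t addr)
{
	size_t d;
	int i;

	if (!addr)              /* 0 means "no valid target identifier" */
		return 0;

	for (d = 0; d < ndevs; d++) {
		for (i = 0; i < NUM_BARS; i++) {
			const struct mock_resource *r = &devs[d].resource[i];

			if (r->start <= addr && addr <= r->end)
				return 1;
			/* Skip the slot holding the upper half of a 64-bit BAR. */
			if (is_64bit_mem_bar(r))
				i++;
		}
	}
	return 0;
}
```

In this model an unused BAR (start == end == 0) can never match, because
an address of 0 is rejected before the loop.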
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
---
arch/ia64/kernel/mca_drv.c | 85 ++++++++++++++++++++++++++++++++++++++++++++
arch/ia64/lib/iomap_check.c | 1
2 files changed, 86 insertions(+)
Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c
===================================================================
--- linux-2.6.11.11.orig/arch/ia64/lib/iomap_check.c
+++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c
@@ -145,3 +145,4 @@ void clear_bridge_error(struct pci_dev *
EXPORT_SYMBOL(iochk_read);
EXPORT_SYMBOL(iochk_clear);
+EXPORT_SYMBOL(iochk_devices); /* for MCA driver */
Index: linux-2.6.11.11/arch/ia64/kernel/mca_drv.c
===================================================================
--- linux-2.6.11.11.orig/arch/ia64/kernel/mca_drv.c
+++ linux-2.6.11.11/arch/ia64/kernel/mca_drv.c
@@ -35,6 +35,13 @@
#include "mca_drv.h"
+#ifdef CONFIG_IOMAP_CHECK
+#include <linux/pci.h>
+#include <asm/io.h>
+extern struct list_head iochk_devices;
+#endif
+
+
/* max size of SAL error record (default) */
static int sal_rec_max = 10000;
@@ -378,6 +385,80 @@ is_mca_global(peidx_table_t *peidx, pal_
return MCA_IS_GLOBAL;
}
+
+#ifdef CONFIG_IOMAP_CHECK
+
+/**
+ * get_target_identifier - get address of target_identifier
+ * @peidx: pointer of index of processor error section
+ *
+ * Return value:
+ * addr if valid / 0 if not valid
+ */
+static u64 get_target_identifier(peidx_table_t *peidx)
+{
+ sal_log_mod_error_info_t *smei;
+
+ smei = peidx_bus_check(peidx, 0);
+ if (smei->valid.target_identifier)
+ return (smei->target_identifier);
+ return 0;
+}
+
+/**
+ * offending_addr_in_check - Check if the addr is in checking resource.
+ * @addr: address offending this MCA
+ *
+ * Return value:
+ * 1 if in / 0 if out
+ */
+static int offending_addr_in_check(u64 addr)
+{
+ int i;
+ struct pci_dev *tdev;
+ iocookie *cookie;
+
+ if (list_empty(&iochk_devices))
+ return 0;
+
+ list_for_each_entry(cookie, &iochk_devices, list) {
+ tdev = cookie->dev;
+ for (i = 0; i < PCI_ROM_RESOURCE; i++) {
+ if (tdev->resource[i].start <= addr
+ && addr <= tdev->resource[i].end)
+ return 1;
+ if ((tdev->resource[i].flags
+ & (PCI_BASE_ADDRESS_SPACE|PCI_BASE_ADDRESS_MEM_TYPE_MASK))
+ == (PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64))
+ i++;
+ }
+ }
+ return 0;
+}
+
+/**
+ * pci_error_recovery - Check if the MCA occurred on a transaction under iochk.
+ * @peidx: pointer of index of processor error section
+ *
+ * Return value:
+ * 1 if error could be caught in driver / 0 if not
+ */
+static int pci_error_recovery(peidx_table_t *peidx)
+{
+ u64 addr;
+
+ addr = get_target_identifier(peidx);
+ if (!addr) return 0;
+
+ if (offending_addr_in_check(addr))
+ return 1;
+
+ return 0;
+}
+
+#endif /* CONFIG_IOMAP_CHECK */
+
+
/**
* recover_from_read_error - Try to recover the errors which type are "read"s.
* @slidx: pointer of index of SAL error record
@@ -400,6 +481,10 @@ recover_from_read_error(slidx_table_t *s
if (!pbci->tv)
return 0;
+#ifdef CONFIG_IOMAP_CHECK
+ if (pci_error_recovery(peidx)) return 1;
+#endif
+
/*
* cpu read or memory-mapped io read
*