[PATCH 2.6.13-rc1 07/10] IOCHK interface for I/O error handling/detecting


From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Linux Kernel list <linux-kernel@vger.kernel.org>,
	linux-ia64@vger.kernel.org, "Luck, Tony" <tony.luck@intel.com>
Cc: Linas Vepstas <linas@austin.ibm.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	long <tlnguyen@snoqualmie.dp.intel.com>,
	linux-pci@atrey.karlin.mff.cuni.cz,
	linuxppc64-dev <linuxppc64-dev@ozlabs.org>
Subject: [PATCH 2.6.13-rc1 07/10] IOCHK interface for I/O error handling/detecting
Date: Wed, 06 Jul 2005 14:17:21 +0900
Message-ID: <42CB6961.2060508@jp.fujitsu.com>
In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com>

[This is 7 of 10 patches, "iochk-07-poison.patch"]

- When a bus error occurs on a write, the data is corrupted on
   the bus, so the target device receives broken data.

   Such a device can respond in one of two ways:
    - send PERR (Parity Error) to the host, expecting an immediate panic.
    - record the error in its status register, expecting its driver
      to read it and decide whether to retry.

   So it is not difficult for a driver to recover from a write
   error if the device takes the latter way, and if the driver does
   not mind the time spent waiting for the write to complete.
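
   The write-side recovery described above can be sketched roughly as
   follows. This is only an illustration against a simulated device:
   struct fake_dev, FAKE_STATUS_WRITE_ERR and write_with_retry() are
   hypothetical names that do not appear anywhere in this patch series,
   and a real driver would read a device-specific status register over
   the bus instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model; none of these names come from the patch. */
#define FAKE_STATUS_WRITE_ERR 0x1u

struct fake_dev {
	uint32_t data;       /* last value the device latched */
	uint32_t status;     /* sticky error bits */
	int      bad_writes; /* how many upcoming writes get corrupted */
};

/* Simulated write: a corrupted transfer makes the device set its error bit. */
static void fake_dev_write(struct fake_dev *dev, uint32_t val)
{
	if (dev->bad_writes > 0) {
		dev->bad_writes--;
		dev->data = ~val;                     /* broken data arrived */
		dev->status |= FAKE_STATUS_WRITE_ERR; /* device records the error */
		return;
	}
	dev->data = val;
}

/* Return the status bits and clear them, as a read-to-clear register would. */
static uint32_t fake_dev_read_and_clear_status(struct fake_dev *dev)
{
	uint32_t status = dev->status;

	dev->status = 0;
	return status;
}

/* Write, check the status register, and retry up to max_tries times. */
static bool write_with_retry(struct fake_dev *dev, uint32_t val, int max_tries)
{
	assert(max_tries > 0);

	for (int i = 0; i < max_tries; i++) {
		fake_dev_write(dev, val);
		if (!(fake_dev_read_and_clear_status(dev) & FAKE_STATUS_WRITE_ERR))
			return true;  /* device saw clean data */
	}
	return false;                 /* still failing, give up */
}
```

   The cost mentioned above shows up as the status read after every
   write: the driver has to wait for the write to complete before it
   can tell whether a retry is needed.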

- When a bus error occurs on a read, the data is corrupted on
   the bus, so the host bridge receives broken data.

   Such a bridge can respond in one of two ways:
    - send BERR (Bus Error) to the host, expecting an immediate panic.
    - mark the data as "poisoned" and pass it on to its destination,
      expecting a panic if the system touches it and cannot stop
      the data pollution.

   The former is the traditional way; the latter is the modern way,
   called "data poisoning". The important difference is whether the
   OS gets a chance to recover from the error.
   Usually, BERR does not tell us where the error came from or
   who requested the data, so we cannot do anything except panic.
   On the other hand, poisoned data reaches its destination
   and causes an error again there. And the destination is
   exactly where the requester lives.

   Well, the idea is quite simple:
    "the driver checks the data it reads, and recovers if it was poisoned."

   Checking all reads at once (e.g. recording every address
   read after iochk_clear and checking them all in
   iochk_read) does not make sense. The practical way is to check
   each read, keep the result, and read that result at the end.
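
   A minimal sketch of "check each read, keep the result, read it at
   the end" might look like the code below. It is a user-space model
   only: poison is represented by a plain flag, whereas in this series
   consuming poisoned data raises an MCA. The sketch_ names, the cookie
   layout and the signatures are assumptions made for illustration and
   are not the iochk_clear/iochk_read interface defined by these patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-session state: one sticky error flag. */
struct sketch_iocookie {
	bool error;
};

/* Hypothetical simulated register: a value plus a "poisoned" marker. */
struct sketch_reg {
	uint32_t value;
	bool     poisoned;
};

/* Start a checked section: forget any earlier error. */
static void sketch_iochk_clear(struct sketch_iocookie *cookie)
{
	cookie->error = false;
}

/* Check this one read immediately and record the outcome in the cookie. */
static uint32_t sketch_checked_read(struct sketch_iocookie *cookie,
				    const struct sketch_reg *reg)
{
	assert(cookie != NULL && reg != NULL);

	if (reg->poisoned) {
		cookie->error = true;   /* keep the result for later */
		return 0xffffffffu;     /* value is not trustworthy */
	}
	return reg->value;
}

/* End of the checked section: report whether any read was bad. */
static bool sketch_iochk_read(const struct sketch_iocookie *cookie)
{
	return cookie->error;
}

/* Read n registers; return 0 on success, -1 if any read was poisoned. */
static int sketch_read_block(const struct sketch_reg *regs, size_t n,
			     uint32_t *out)
{
	struct sketch_iocookie cookie;

	sketch_iochk_clear(&cookie);
	for (size_t i = 0; i < n; i++)
		out[i] = sketch_checked_read(&cookie, &regs[i]);

	return sketch_iochk_read(&cookie) ? -1 : 0;
}
```

   The point of this shape is that nothing has to remember which
   addresses were touched: each read is checked when it happens, and
   the final call only returns the accumulated flag.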

Touching poisoned data raises an MCA, which currently means
the system goes down. But since the MCA tells us where it happened,
can we recover from it...? All right, let's look at the next patch (8 of 10).

Changes from the previous version for 2.6.11.11:
   - move the barrier function macro into gcc_intrin.h.
   - could anyone write the same barrier for the Intel compiler?
     Tony or David, could you help me?

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>

---

  include/asm-ia64/gcc_intrin.h |   16 +++++++
  include/asm-ia64/io.h         |   96 ++++++++++++++++++++++++++++++++++++++++++
  2 files changed, 112 insertions(+)

Index: linux-2.6.13-rc1/include/asm-ia64/io.h
===================================================================
--- linux-2.6.13-rc1.orig/include/asm-ia64/io.h
+++ linux-2.6.13-rc1/include/asm-ia64/io.h
@@ -189,6 +189,8 @@ __ia64_mk_io_addr (unsigned long port)
   * during optimization, which is why we use "volatile" pointers.
   */

+#ifdef CONFIG_IOMAP_CHECK
+
  static inline unsigned int
  ___ia64_inb (unsigned long port)
  {
@@ -197,6 +199,8 @@ ___ia64_inb (unsigned long port)

  	ret = *addr;
  	__ia64_mf_a();
+	ia64_mca_barrier(ret);
+
  	return ret;
  }

@@ -208,6 +212,8 @@ ___ia64_inw (unsigned long port)

  	ret = *addr;
  	__ia64_mf_a();
+	ia64_mca_barrier(ret);
+
  	return ret;
  }

@@ -219,9 +225,48 @@ ___ia64_inl (unsigned long port)

  	ret = *addr;
  	__ia64_mf_a();
+	ia64_mca_barrier(ret);
+
+	return ret;
+}
+
+#else /* CONFIG_IOMAP_CHECK */
+
+static inline unsigned int
+___ia64_inb (unsigned long port)
+{
+	volatile unsigned char *addr = __ia64_mk_io_addr(port);
+	unsigned char ret;
+
+	ret = *addr;
+	__ia64_mf_a();
+	return ret;
+}
+
+static inline unsigned int
+___ia64_inw (unsigned long port)
+{
+	volatile unsigned short *addr = __ia64_mk_io_addr(port);
+	unsigned short ret;
+
+	ret = *addr;
+	__ia64_mf_a();
  	return ret;
  }

+static inline unsigned int
+___ia64_inl (unsigned long port)
+{
+	volatile unsigned int *addr = __ia64_mk_io_addr(port);
+	unsigned int ret;
+
+	ret = *addr;
+	__ia64_mf_a();
+	return ret;
+}
+
+#endif /* CONFIG_IOMAP_CHECK */
+
  static inline void
  ___ia64_outb (unsigned char val, unsigned long port)
  {
@@ -338,6 +383,55 @@ __outsl (unsigned long port, const void
   * a good idea).  Writes are ok though for all existing ia64 platforms (and
   * hopefully it'll stay that way).
   */
+
+#ifdef CONFIG_IOMAP_CHECK
+
+static inline unsigned char
+___ia64_readb (const volatile void __iomem *addr)
+{
+	unsigned char val;
+
+	val = *(volatile unsigned char __force *)addr;
+	ia64_mca_barrier(val);
+
+	return val;
+}
+
+static inline unsigned short
+___ia64_readw (const volatile void __iomem *addr)
+{
+	unsigned short val;
+
+	val = *(volatile unsigned short __force *)addr;
+	ia64_mca_barrier(val);
+
+	return val;
+}
+
+static inline unsigned int
+___ia64_readl (const volatile void __iomem *addr)
+{
+	unsigned int val;
+
+	val = *(volatile unsigned int __force *) addr;
+	ia64_mca_barrier(val);
+
+	return val;
+}
+
+static inline unsigned long
+___ia64_readq (const volatile void __iomem *addr)
+{
+	unsigned long val;
+
+	val = *(volatile unsigned long __force *) addr;
+	ia64_mca_barrier(val);
+
+	return val;
+}
+
+#else /* CONFIG_IOMAP_CHECK */
+
  static inline unsigned char
  ___ia64_readb (const volatile void __iomem *addr)
  {
@@ -362,6 +456,8 @@ ___ia64_readq (const volatile void __iom
  	return *(volatile unsigned long __force *) addr;
  }

+#endif /* CONFIG_IOMAP_CHECK */
+
  static inline void
  __writeb (unsigned char val, volatile void __iomem *addr)
  {
Index: linux-2.6.13-rc1/include/asm-ia64/gcc_intrin.h
===================================================================
--- linux-2.6.13-rc1.orig/include/asm-ia64/gcc_intrin.h
+++ linux-2.6.13-rc1/include/asm-ia64/gcc_intrin.h
@@ -598,4 +598,20 @@ do {								\
  		      :: "r"((x)) : "p6", "p7", "memory");	\
  } while (0)

+/*
+ * Some I/O bridges may poison the data read, instead of
+ * signaling a BERR. The consumption of poisoned data
+ * triggers an MCA, which tells us the polluted address.
+ * Note that the read operation by itself does not consume
+ * the bad data; you have to do something with it, e.g.:
+ *
+ *	ld8	r9=[r10];;	// r10 == I/O address
+ *	add	r8=r9,r0;;	// fake operation
+ */
+#define ia64_mca_barrier(val)					\
+({								\
+	register unsigned long gr8 asm("r8");			\
+	asm volatile ("add %0=%1,r0" : "=r"(gr8) : "r"(val));	\
+})
+
  #endif /* _ASM_IA64_GCC_INTRIN_H */
