linuxppc-dev.lists.ozlabs.org archive mirror
From: Mike Mason <mmlnx@us.ibm.com>
To: paulus@samba.org, benh@kernel.crashing.org,
	linasvepstas@gmail.com, linuxppc-dev@ozlabs.org
Subject: Re: [PATCH] Don't panic when EEH_MAX_FAILS is exceeded
Date: Mon, 21 Jul 2008 09:40:17 -0700
Message-ID: <4884BBF1.2060708@us.ibm.com>
In-Reply-To: <488383D4.9000602@us.ibm.com>

Here's a repost of the patch with the suggested changes.

This patch changes the action taken when EEH_MAX_FAILS is reached from
a panic to printing an error message.  Panicking under this condition is
too harsh.  Although performance will be affected and the device may not
recover, the system is still running, which at the very least should
allow for a more graceful shutdown.  The message is repeated each time
the check count reaches another multiple of EEH_MAX_FAILS.  The patch
also removes the msleep() call made while holding a spinlock, which can
lead to a deadlock and is not recommended.

Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
Acked-by: Linas Vepstas <linasvepstas@gmail.com>

--- powerpc.git/arch/powerpc/platforms/pseries/eeh.c	2008-07-18 08:51:42.000000000 -0700
+++ powerpc.git-new/arch/powerpc/platforms/pseries/eeh.c	2008-07-21 03:25:43.000000000 -0700
@@ -75,9 +75,9 @@
  */
 
 /* If a device driver keeps reading an MMIO register in an interrupt
- * handler after a slot isolation event has occurred, we assume it
- * is broken and panic.  This sets the threshold for how many read
- * attempts we allow before panicking.
+ * handler after a slot isolation event, it might be broken.
+ * This sets the threshold for how many read attempts we allow
+ * before printing an error message.
  */
 #define EEH_MAX_FAILS	2100000
 
@@ -470,6 +470,7 @@ int eeh_dn_check_failure(struct device_n
 	unsigned long flags;
 	struct pci_dn *pdn;
 	int rc = 0;
+	const char *location;
 
 	total_mmio_ffs++;
 
@@ -509,18 +510,15 @@ int eeh_dn_check_failure(struct device_n
 	rc = 1;
 	if (pdn->eeh_mode & EEH_MODE_ISOLATED) {
 		pdn->eeh_check_count ++;
-		if (pdn->eeh_check_count >= EEH_MAX_FAILS) {
-			printk (KERN_ERR "EEH: Device driver ignored %d bad reads, panicing\n",
-			        pdn->eeh_check_count);
+		if (pdn->eeh_check_count % EEH_MAX_FAILS == 0) {
+			location = of_get_property(dn, "ibm,loc-code", NULL);
+			printk (KERN_ERR "EEH: %d reads ignored for recovering device at "
+				"location=%s driver=%s pci addr=%s\n",
+				pdn->eeh_check_count, location,
+				dev->driver->name, pci_name(dev));
+			printk (KERN_ERR "EEH: Might be infinite loop in %s driver\n",
+				dev->driver->name);
 			dump_stack();
-			msleep(5000);
-			
-			/* re-read the slot reset state */
-			if (read_slot_reset_state(pdn, rets) != 0)
-				rets[0] = -1;	/* reset state unknown */
-
-			/* If we are here, then we hit an infinite loop. Stop. */
-			panic("EEH: MMIO halt (%d) on device:%s\n", rets[0], pci_name(dev));
 		}
 		goto dn_unlock;
 	}
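
The change from ">= EEH_MAX_FAILS" to "% EEH_MAX_FAILS == 0" in the hunk
above means a report is produced once per EEH_MAX_FAILS failed reads
rather than on every read past the threshold.  A minimal userspace
sketch of that counting behaviour follows; it is not kernel code.
struct fake_pdn, note_failed_read() and count_reports() are hypothetical
names introduced only for this illustration; the threshold value is the
one from the patch.

```c
#define EEH_MAX_FAILS 2100000	/* threshold value taken from the patch */

/* Hypothetical stand-in for struct pci_dn; only the counter matters here. */
struct fake_pdn {
	int eeh_check_count;
};

/*
 * Record one failed read on an isolated device.
 * Returns 1 when the count has just reached a multiple of
 * EEH_MAX_FAILS (the point where the patch prints its message),
 * 0 otherwise.
 */
static int note_failed_read(struct fake_pdn *pdn)
{
	pdn->eeh_check_count++;
	return pdn->eeh_check_count % EEH_MAX_FAILS == 0;
}

/* Simulate n failed reads and return how many reports they trigger. */
static int count_reports(long n)
{
	struct fake_pdn pdn = { 0 };
	int reports = 0;
	long i;

	for (i = 0; i < n; i++)
		reports += note_failed_read(&pdn);
	return reports;
}
```

Under these assumptions, count_reports(EEH_MAX_FAILS - 1) yields no
report, count_reports(EEH_MAX_FAILS) yields one, and a driver that keeps
looping gets one further report for every additional EEH_MAX_FAILS
reads instead of flooding the log.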
