linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Nathan Lynch <ntl@pobox.com>
To: Mike Mason <mmlnx@us.ibm.com>
Cc: linasvepstas@gmail.com, paulus@samba.org, linuxppc-dev@ozlabs.org
Subject: Re: [PATCH] Don't panic when EEH_MAX_FAILS is exceeded
Date: Sun, 20 Jul 2008 15:33:33 -0500	[thread overview]
Message-ID: <20080720203333.GQ9594@localdomain> (raw)
In-Reply-To: <488383D4.9000602@us.ibm.com>

Mike Mason wrote:
>
> This patch changes the EEH_MAX_FAILS action from panic to printing
> an error message.  Panicking under under this condition is too
> harsh.  Although performance will be affected and the device may not
> recover, the system is still running, which at the very least,
> should allow for a more graceful shutdown.  The panic() is now
> wrapped in a DEBUG statement for development purposes.  The patch
> also removes the msleep() within a spinlock, which is not allowed.

> @@ -509,18 +510,24 @@

For ease of review, please try to use diff -p to generate patches.

> 	rc = 1;
> 	if (pdn->eeh_mode & EEH_MODE_ISOLATED) {
> 		pdn->eeh_check_count ++;
> -		if (pdn->eeh_check_count >= EEH_MAX_FAILS) {
> -			printk (KERN_ERR "EEH: Device driver ignored %d bad reads, panicing\n",
> -			        pdn->eeh_check_count);
> +		if (pdn->eeh_check_count % EEH_MAX_FAILS == 0) {
> +			location = (char *) of_get_property(dn, "ibm,loc-code", NULL);

Unneeded cast here, I think.

> +			printk (KERN_ERR "EEH: %d reads ignored for recovering device at "
> +				"location=%s driver=%s pci addr=%s\n",
> +				pdn->eeh_check_count, location,
> +				dev->driver->name, pci_name(dev));
> +			printk (KERN_ERR "EEH: Might be infinite loop in %s driver\n",
> +				dev->driver->name);
> +#ifdef DEBUG
> 			dump_stack();
> -			msleep(5000);
> -			
> +
> 			/* re-read the slot reset state */
> 			if (read_slot_reset_state(pdn, rets) != 0)
> 				rets[0] = -1;	/* reset state unknown */
>
> 			/* If we are here, then we hit an infinite loop. Stop. */
> 			panic("EEH: MMIO halt (%d) on device:%s\n", rets[0], pci_name(dev));
> +#endif

While I tend to agree that panic() is unnecessary, don't we want a
stack dump unconditionally (i.e. not bracketed in #ifdef DEBUG)?

I'd prefer just removing the code instead of adding #ifdef's in the
middle of this function.  eeh.c needs less #ifdef DEBUG, not more :)

  parent reply	other threads:[~2008-07-20 20:33 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-20 18:28 [PATCH] Don't panic when EEH_MAX_FAILS is exceeded Mike Mason
2008-07-20 18:58 ` Sean MacLennan
2008-07-20 20:17   ` Nathan Lynch
2008-07-21  3:47     ` Sean MacLennan
2008-07-21  4:09       ` Stephen Rothwell
2008-07-21 15:04         ` Sean MacLennan
2008-07-20 20:33 ` Nathan Lynch [this message]
2008-07-20 23:19   ` Linas Vepstas
2008-07-21 16:40 ` Mike Mason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080720203333.GQ9594@localdomain \
    --to=ntl@pobox.com \
    --cc=linasvepstas@gmail.com \
    --cc=linuxppc-dev@ozlabs.org \
    --cc=mmlnx@us.ibm.com \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).