linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH]: powerpc/pseries: Print PCI slot location code on failure
@ 2006-04-28 22:42 Linas Vepstas
From: Linas Vepstas @ 2006-04-28 22:42 UTC
  To: Paul Mackerras; +Cc: linuxppc-dev, linux-kernel


Paul,

A low-priority patch that improves diagnostic printing on EEH failures.
Please review and forward upstream as appropriate.

--linas

[PATCH]: powerpc/pseries: Print PCI slot location code on failure

The PCI error recovery code will printk diagnostic info when
a PCI error event occurs. Change the messages to include the slot
location code, which is how most sysadmins will know the device.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>

----
 arch/powerpc/platforms/pseries/eeh_driver.c |   33 ++++++++++++++++------------
 1 files changed, 20 insertions(+), 13 deletions(-)

Index: linux-2.6.17-rc1/arch/powerpc/platforms/pseries/eeh_driver.c
===================================================================
--- linux-2.6.17-rc1.orig/arch/powerpc/platforms/pseries/eeh_driver.c	2006-04-28 17:31:31.000000000 -0500
+++ linux-2.6.17-rc1/arch/powerpc/platforms/pseries/eeh_driver.c	2006-04-28 17:31:39.000000000 -0500
@@ -261,16 +261,20 @@ struct pci_dn * handle_eeh_events (struc
 	struct pci_bus *frozen_bus;
 	int rc = 0;
 	enum pci_ers_result result = PCI_ERS_RESULT_NONE;
-	const char *pci_str, *drv_str;
+	const char *location, *pci_str, *drv_str;
 
 	frozen_dn = find_device_pe(event->dn);
 	frozen_bus = pcibios_find_pci_bus(frozen_dn);
 
 	if (!frozen_dn) {
-		printk(KERN_ERR "EEH: Error: Cannot find partition endpoint for %s\n",
-		        pci_name(event->dev));
+
+		location = (char *) get_property(event->dn, "ibm,loc-code", NULL);
+		printk(KERN_ERR "EEH: Error: Cannot find partition endpoint "
+		                "for location=%s pci addr=%s\n",
+		        location, pci_name(event->dev));
 		return NULL;
 	}
+	location = (char *) get_property(frozen_dn, "ibm,loc-code", NULL);
 
 	/* There are two different styles for coming up with the PE.
 	 * In the old style, it was the highest EEH-capable device
@@ -282,8 +286,9 @@ struct pci_dn * handle_eeh_events (struc
 		frozen_bus = pcibios_find_pci_bus (frozen_dn->parent);
 
 	if (!frozen_bus) {
-		printk(KERN_ERR "EEH: Cannot find PCI bus for %s\n",
-		        frozen_dn->full_name);
+		printk(KERN_ERR "EEH: Cannot find PCI bus "
+		        "for location=%s dn=%s\n",
+		        location, frozen_dn->full_name);
 		return NULL;
 	}
 
@@ -318,8 +323,9 @@ struct pci_dn * handle_eeh_events (struc
 
 	eeh_slot_error_detail(frozen_pdn, 1 /* Temporary Error */);
 	printk(KERN_WARNING
-	   "EEH: This PCI device has failed %d times since last reboot: %s - %s\n",
-		frozen_pdn->eeh_freeze_count, drv_str, pci_str);
+	   "EEH: This PCI device has failed %d times since last reboot: "
+		"location=%s driver=%s pci addr=%s\n",
+		frozen_pdn->eeh_freeze_count, location, drv_str, pci_str);
 
 	/* Walk the various device drivers attached to this slot through
 	 * a reset sequence, giving each an opportunity to do what it needs
@@ -368,17 +374,18 @@ excess_failures:
 	 * due to actual, failed cards.
 	 */
 	printk(KERN_ERR
-	   "EEH: PCI device %s - %s has failed %d times \n"
-	   "and has been permanently disabled.  Please try reseating\n"
-	   "this device or replacing it.\n",
-		drv_str, pci_str, frozen_pdn->eeh_freeze_count);
+	   "EEH: PCI device at location=%s driver=%s pci addr=%s \n"
+		"has failed %d times and has been permanently disabled. \n"
+		"Please try reseating this device or replacing it.\n",
+		location, drv_str, pci_str, frozen_pdn->eeh_freeze_count);
 	goto perm_error;
 
 hard_fail:
 	printk(KERN_ERR
-	   "EEH: Unable to recover from failure of PCI device %s - %s\n"
+	   "EEH: Unable to recover from failure of PCI device "
+	   "at location=%s driver=%s pci addr=%s \n"
 	   "Please try reseating this device or replacing it.\n",
-		drv_str, pci_str);
+		location, drv_str, pci_str);
 
 perm_error:
 	eeh_slot_error_detail(frozen_pdn, 2 /* Permanent Error */);

* Re: [PATCH]: powerpc/pseries: Print PCI slot location code on failure
  2006-04-28 22:42 [PATCH]: powerpc/pseries: Print PCI slot location code on failure Linas Vepstas
@ 2006-04-29  8:00 ` Paul Mackerras
From: Paul Mackerras @ 2006-04-29  8:00 UTC
  To: Linas Vepstas; +Cc: linuxppc-dev, linux-kernel

Linas Vepstas writes:

> +		location = (char *) get_property(event->dn, "ibm,loc-code", NULL);
> +		printk(KERN_ERR "EEH: Error: Cannot find partition endpoint "
> +		                "for location=%s pci addr=%s\n",
> +		        location, pci_name(event->dev));

If location is NULL, printk will fortunately save us from a null
pointer dereference; still, it might be nice to have the message say
"location=unknown" or something rather than "location=<NULL>".

Paul.
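
For illustration only, the fallback suggested in the reply above might look
roughly like the following. This is a userspace sketch, not kernel code and
not part of the posted patch: eeh_loc_or_unknown() and format_missing_pe()
are hypothetical names, snprintf() stands in for printk(), and treating an
empty location string the same as a missing one is an added assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the patch): return a readable placeholder
 * when the "ibm,loc-code" lookup yields no usable string.
 * Assumption: an empty string is treated the same as a NULL result. */
static const char *eeh_loc_or_unknown(const char *location)
{
	if (location == NULL || strlen(location) == 0)
		return "unknown";
	return location;
}

/* Userspace stand-in for the printk() call in the quoted hunk: formats
 * the message into buf so that the text can be inspected. Returns the
 * value reported by snprintf(). */
static int format_missing_pe(char *buf, size_t len,
			     const char *location, const char *pci_addr)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"EEH: Error: Cannot find partition endpoint "
			"for location=%s pci addr=%s\n",
			eeh_loc_or_unknown(location), pci_addr);
}
```

In the patch itself, the same idea would amount to passing the result of
get_property() through such a helper once, right after each lookup, so that
every later message prints "location=unknown" instead of "location=<NULL>".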
