linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] eeh: Fixing a bug when pci structure is null
@ 2010-01-27 18:43 leitao
  2010-01-27 18:43 ` [PATCH 2/2] eeh: fixing pci_dev dependency leitao
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: leitao @ 2010-01-27 18:43 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Linas Vepstas, Breno Leitao

During a EEH recover, the pci_dev structure can be null, and currently
the kernel is crashing when pci_dev is null, with the following message:

Unable to handle kernel paging request for data at address 0x000000a0
Faulting instruction address: 0xc00000000006b8b4
Oops: Kernel access of bad area, sig: 11 [#1]

NIP [c00000000006b8b4] .eeh_event_handler+0x10c/0x1a0
LR [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
Call Trace:
[c0000003a80dff00] [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
[c0000003a80dff90] [c000000000031f1c] .kernel_thread+0x54/0x70

The bug occurs because pci_name() tries to access a null pointer.
This patch just guarantee that pci_name() is not called on Null pointers.

Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
---
 arch/powerpc/include/asm/eeh.h              |    1 +
 arch/powerpc/platforms/pseries/eeh.c        |    4 ++--
 arch/powerpc/platforms/pseries/eeh_driver.c |   11 +++++++++--
 arch/powerpc/platforms/pseries/eeh_event.c  |    2 +-
 4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 66ea9b8..f860f56 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -49,6 +49,7 @@ unsigned long eeh_check_failure(const volatile void __iomem *token,
 				unsigned long val);
 int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev);
 void __init pci_addr_cache_build(void);
+const char *eeh_pci_name(struct pci_dev *pdev);
 
 /**
  * eeh_add_device_early
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index ccd8dd0..f9360fe 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -491,7 +491,7 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 	    pdn->eeh_mode & EEH_MODE_NOCHECK) {
 		ignored_check++;
 		pr_debug("EEH: Ignored check (%x) for %s %s\n",
-			 pdn->eeh_mode, pci_name (dev), dn->full_name);
+			 pdn->eeh_mode, eeh_pci_name (dev), dn->full_name);
 		return 0;
 	}
 
@@ -515,7 +515,7 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 			printk (KERN_ERR "EEH: %d reads ignored for recovering device at "
 				"location=%s driver=%s pci addr=%s\n",
 				pdn->eeh_check_count, location,
-				dev->driver->name, pci_name(dev));
+				dev->driver->name, eeh_pci_name(dev));
 			printk (KERN_ERR "EEH: Might be infinite loop in %s driver\n",
 				dev->driver->name);
 			dump_stack();
diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
index ef8e454..afdddf6 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -41,6 +41,13 @@ static inline const char * pcid_name (struct pci_dev *pdev)
 	return "";
 }
 
+inline const char *eeh_pci_name(struct pci_dev *pdev)
+{
+        if (NULL==pdev)
+                return "<null>";
+        return pci_name(pdev);
+}
+
 #if 0
 static void print_device_node_tree(struct pci_dn *pdn, int dent)
 {
@@ -337,7 +344,7 @@ struct pci_dn * handle_eeh_events (struct eeh_event *event)
 		location = location ? location : "unknown";
 		printk(KERN_ERR "EEH: Error: Cannot find partition endpoint "
 		                "for location=%s pci addr=%s\n",
-		        location, pci_name(event->dev));
+		        location, eeh_pci_name(event->dev));
 		return NULL;
 	}
 
@@ -368,7 +375,7 @@ struct pci_dn * handle_eeh_events (struct eeh_event *event)
 		pci_str = pci_name (frozen_pdn->pcidev);
 		drv_str = pcid_name (frozen_pdn->pcidev);
 	} else {
-		pci_str = pci_name (event->dev);
+		pci_str = eeh_pci_name (event->dev);
 		drv_str = pcid_name (event->dev);
 	}
 	
diff --git a/arch/powerpc/platforms/pseries/eeh_event.c b/arch/powerpc/platforms/pseries/eeh_event.c
index ddb80f5..ec5df8f 100644
--- a/arch/powerpc/platforms/pseries/eeh_event.c
+++ b/arch/powerpc/platforms/pseries/eeh_event.c
@@ -80,7 +80,7 @@ static int eeh_event_handler(void * dummy)
 	eeh_mark_slot(event->dn, EEH_MODE_RECOVERING);
 
 	printk(KERN_INFO "EEH: Detected PCI bus error on device %s\n",
-	       pci_name(event->dev));
+	       eeh_pci_name(event->dev));
 
 	pdn = handle_eeh_events(event);
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] eeh: fixing pci_dev dependency
  2010-01-27 18:43 [PATCH 1/2] eeh: Fixing a bug when pci structure is null leitao
@ 2010-01-27 18:43 ` leitao
  2010-01-29  0:04   ` Benjamin Herrenschmidt
  2010-01-27 19:09 ` [PATCH 1/2] eeh: Fixing a bug when pci structure is null Linas Vepstas
  2010-01-29  0:03 ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 6+ messages in thread
From: leitao @ 2010-01-27 18:43 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Linas Vepstas, Breno Leitao

Currently pci_dev can be null when EEH is in action. This patch
just assure that we pci_dev is not NULL before calling pci_dev_put.

Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
---
 arch/powerpc/platforms/pseries/eeh_event.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh_event.c b/arch/powerpc/platforms/pseries/eeh_event.c
index ec5df8f..7956e46 100644
--- a/arch/powerpc/platforms/pseries/eeh_event.c
+++ b/arch/powerpc/platforms/pseries/eeh_event.c
@@ -85,7 +85,8 @@ static int eeh_event_handler(void * dummy)
 	pdn = handle_eeh_events(event);
 
 	eeh_clear_slot(event->dn, EEH_MODE_RECOVERING);
-	pci_dev_put(event->dev);
+	if (event->dev)
+		pci_dev_put(event->dev);
 	kfree(event);
 	mutex_unlock(&eeh_event_mutex);
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] eeh: Fixing a bug when pci structure is null
  2010-01-27 18:43 [PATCH 1/2] eeh: Fixing a bug when pci structure is null leitao
  2010-01-27 18:43 ` [PATCH 2/2] eeh: fixing pci_dev dependency leitao
@ 2010-01-27 19:09 ` Linas Vepstas
  2010-01-29  0:03 ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 6+ messages in thread
From: Linas Vepstas @ 2010-01-27 19:09 UTC (permalink / raw)
  To: leitao; +Cc: linuxppc-dev

Hi,

Yes, that's really my sign off: I discussed this at length with Breno.
One comment:

2010/1/27  <leitao@linux.vnet.ibm.com>:
> During a EEH recover, the pci_dev structure can be null,

It can be null when an error is detected during device config (i.e. via
pci config space access through open firmware, using the OF device
node, instead of mmio/dma access), before the kernel has created
a pci_dev structure for the device.

Oddly enough, either this slipped through the cracks all this time,
or maybe pci_name() used to protect against nulls? Not sure.

-- Linas Vepstas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] eeh: Fixing a bug when pci structure is null
  2010-01-27 18:43 [PATCH 1/2] eeh: Fixing a bug when pci structure is null leitao
  2010-01-27 18:43 ` [PATCH 2/2] eeh: fixing pci_dev dependency leitao
  2010-01-27 19:09 ` [PATCH 1/2] eeh: Fixing a bug when pci structure is null Linas Vepstas
@ 2010-01-29  0:03 ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2010-01-29  0:03 UTC (permalink / raw)
  To: leitao; +Cc: linuxppc-dev, Linas Vepstas

On Wed, 2010-01-27 at 12:43 -0600, leitao@linux.vnet.ibm.com wrote:

> diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
> index ef8e454..afdddf6 100644
> --- a/arch/powerpc/platforms/pseries/eeh_driver.c
> +++ b/arch/powerpc/platforms/pseries/eeh_driver.c
> @@ -41,6 +41,13 @@ static inline const char * pcid_name (struct pci_dev *pdev)
>  	return "";
>  }
>  
> +inline const char *eeh_pci_name(struct pci_dev *pdev)
> +{
> +        if (NULL==pdev)
> +                return "<null>";
> +        return pci_name(pdev);
> +}
> +

Coding style gack ... just make it static inline in the header file.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] eeh: fixing pci_dev dependency
  2010-01-27 18:43 ` [PATCH 2/2] eeh: fixing pci_dev dependency leitao
@ 2010-01-29  0:04   ` Benjamin Herrenschmidt
  2010-01-29 15:42     ` Linas Vepstas
  0 siblings, 1 reply; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2010-01-29  0:04 UTC (permalink / raw)
  To: leitao; +Cc: linuxppc-dev, Linas Vepstas

On Wed, 2010-01-27 at 12:43 -0600, leitao@linux.vnet.ibm.com wrote:
> Currently pci_dev can be null when EEH is in action. This patch
> just assure that we pci_dev is not NULL before calling pci_dev_put.

Like all variants of *_put(), it already checks for a NULL argument
afaik. So that patch should be unnecessary.

Cheers,
Ben.
 
> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
> Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
> ---
>  arch/powerpc/platforms/pseries/eeh_event.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/eeh_event.c b/arch/powerpc/platforms/pseries/eeh_event.c
> index ec5df8f..7956e46 100644
> --- a/arch/powerpc/platforms/pseries/eeh_event.c
> +++ b/arch/powerpc/platforms/pseries/eeh_event.c
> @@ -85,7 +85,8 @@ static int eeh_event_handler(void * dummy)
>  	pdn = handle_eeh_events(event);
>  
>  	eeh_clear_slot(event->dn, EEH_MODE_RECOVERING);
> -	pci_dev_put(event->dev);
> +	if (event->dev)
> +		pci_dev_put(event->dev);
>  	kfree(event);
>  	mutex_unlock(&eeh_event_mutex);
>  

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] eeh: fixing pci_dev dependency
  2010-01-29  0:04   ` Benjamin Herrenschmidt
@ 2010-01-29 15:42     ` Linas Vepstas
  0 siblings, 0 replies; 6+ messages in thread
From: Linas Vepstas @ 2010-01-29 15:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, leitao

On 28 January 2010 18:04, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Wed, 2010-01-27 at 12:43 -0600, leitao@linux.vnet.ibm.com wrote:
>> Currently pci_dev can be null when EEH is in action. This patch
>> just assure that we pci_dev is not NULL before calling pci_dev_put.
>
> Like all variants of *_put(), it already checks for a NULL argument
> afaik. So that patch should be unnecessary.

Ah, OK, I paniced when I saw that and assumed the worst

--linas

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-01-29 15:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-27 18:43 [PATCH 1/2] eeh: Fixing a bug when pci structure is null leitao
2010-01-27 18:43 ` [PATCH 2/2] eeh: fixing pci_dev dependency leitao
2010-01-29  0:04   ` Benjamin Herrenschmidt
2010-01-29 15:42     ` Linas Vepstas
2010-01-27 19:09 ` [PATCH 1/2] eeh: Fixing a bug when pci structure is null Linas Vepstas
2010-01-29  0:03 ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).