* [PATCH] powerpc/eeh: crash caused by null eeh_dev
From: Gavin Shan @ 2012-04-17 5:55 UTC
To: linuxppc-dev; +Cc: anton, Gavin Shan
The problem was reported by Anton Blanchard. The kernel crashed
when an EEH error hit a PCI device that had no device driver
bound to it. I reproduced the problem on a Firebird-L machine
using the "errinjct" utility: the device driver for the Emulex
ethernet MAC was disabled in .config, and a data parity error
was forced on the Emulex ethernet MAC with "errinjct". The
kernel then crashed after issuing a couple of "lspci -v"
commands.
The root cause is that the PCI device, including its reference
to the corresponding eeh device, is removed from the system
while EEH does recovery. Afterwards, the PCI device is probed
again and added back to the system. So it is not safe to
retrieve the eeh device from the PCI device after the PCI device
has been removed and before it has been added again.
The patch fixes the issue by retrieving the eeh device from the OF
node instead of the PCI device after the PCI device has been removed.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/eeh.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 309d38e..a75e37d 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -1076,7 +1076,7 @@ static void eeh_add_device_late(struct pci_dev *dev)
 	pr_debug("EEH: Adding device %s\n", pci_name(dev));
 
 	dn = pci_device_to_OF_node(dev);
-	edev = pci_dev_to_eeh_dev(dev);
+	edev = of_node_to_eeh_dev(dn);
 	if (edev->pdev == dev) {
 		pr_debug("EEH: Already referenced !\n");
 		return;
--
1.7.5.4
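
For readers following the reasoning in the commit message, here is a
minimal userspace sketch of why the lookup path matters. It is not the
kernel code: every type and function name ending in or starting with
"model" is hypothetical, the structures are reduced to the two pointers
that matter, and the NULL check in model_add_device_late() is an
addition of this sketch that the hunk above does not contain. The only
assumption taken from the commit message is that the eeh device stays
reachable from the OF node while the PCI device is removed and
re-probed.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct pci_dev_model;

struct eeh_dev_model {
	struct pci_dev_model *pdev;	/* NULL while the PCI device is removed */
};

struct of_node_model {
	struct eeh_dev_model *edev;	/* assumed to survive PCI device removal */
};

struct pci_dev_model {
	struct of_node_model *dn;
	struct eeh_dev_model *edev;	/* unset on a freshly re-probed device */
};

/* Lookup through the PCI device: yields NULL right after a re-probe. */
struct eeh_dev_model *model_pci_dev_to_eeh_dev(struct pci_dev_model *pdev)
{
	return pdev ? pdev->edev : NULL;
}

/* Lookup through the OF node: still valid after a re-probe. */
struct eeh_dev_model *model_of_node_to_eeh_dev(struct of_node_model *dn)
{
	return dn ? dn->edev : NULL;
}

/*
 * Models the late-add step using the OF node lookup.
 * Returns 0 when a new link is made, 1 when the device is already
 * referenced, and -1 when no eeh device exists (sketch-only check).
 */
int model_add_device_late(struct pci_dev_model *pdev)
{
	struct eeh_dev_model *edev = model_of_node_to_eeh_dev(pdev->dn);

	if (!edev)
		return -1;
	if (edev->pdev == pdev)
		return 1;

	edev->pdev = pdev;
	pdev->edev = edev;
	return 0;
}

/* Walks through the re-probe scenario; returns 0 if every step holds. */
int model_demo(void)
{
	struct eeh_dev_model edev = { NULL };
	struct of_node_model dn = { &edev };
	struct pci_dev_model pdev = { &dn, NULL };	/* just re-probed */

	/* The PCI-device path has nothing to return yet. */
	if (model_pci_dev_to_eeh_dev(&pdev) != NULL)
		return 1;
	/* The OF-node path finds the eeh device and links it. */
	if (model_add_device_late(&pdev) != 0)
		return 2;
	if (model_pci_dev_to_eeh_dev(&pdev) != &edev)
		return 3;
	/* A second call sees the existing reference. */
	if (model_add_device_late(&pdev) != 1)
		return 4;
	return 0;
}
```

In this model, dereferencing the result of the PCI-device lookup right
after the re-probe would be a NULL pointer dereference, which matches
the kind of crash described above; going through the OF node avoids it.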
* Re: [PATCH] powerpc/eeh: crash caused by null eeh_dev
From: Anton Blanchard @ 2012-04-18 1:16 UTC
To: Gavin Shan; +Cc: linuxppc-dev
Hi Gavin,
> The problem was reported by Anton Blanchard. The kernel crashed
> when an EEH error hit a PCI device that had no device driver
> bound to it. I reproduced the problem on a Firebird-L machine
> using the "errinjct" utility: the device driver for the Emulex
> ethernet MAC was disabled in .config, and a data parity error
> was forced on the Emulex ethernet MAC with "errinjct". The
> kernel then crashed after issuing a couple of "lspci -v"
> commands.
>
> The root cause is that the PCI device, including its reference
> to the corresponding eeh device, is removed from the system
> while EEH does recovery. Afterwards, the PCI device is probed
> again and added back to the system. So it is not safe to
> retrieve the eeh device from the PCI device after the PCI device
> has been removed and before it has been added again.
>
> The patch fixes the issue by retrieving the eeh device from the OF
> node instead of the PCI device after the PCI device has been removed.
Thanks, this does fix the oops I see.
Tested-by: Anton Blanchard <anton@samba.org>
Anton
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
> arch/powerpc/platforms/pseries/eeh.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
> index 309d38e..a75e37d 100644
> --- a/arch/powerpc/platforms/pseries/eeh.c
> +++ b/arch/powerpc/platforms/pseries/eeh.c
> @@ -1076,7 +1076,7 @@ static void eeh_add_device_late(struct pci_dev *dev)
>  	pr_debug("EEH: Adding device %s\n", pci_name(dev));
>
>  	dn = pci_device_to_OF_node(dev);
> -	edev = pci_dev_to_eeh_dev(dev);
> +	edev = of_node_to_eeh_dev(dn);
>  	if (edev->pdev == dev) {
>  		pr_debug("EEH: Already referenced !\n");
>  		return;