From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by lists.ozlabs.org (Postfix) with ESMTPS id 3znNqL00XXzF0ny
	for ; Fri, 23 Feb 2018 05:56:13 +1100 (AEDT)
Received: from pps.filterd (m0098394.ppops.net [127.0.0.1])
	by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w1MIqViA146322
	for ; Thu, 22 Feb 2018 13:56:11 -0500
Received: from e15.ny.us.ibm.com (e15.ny.us.ibm.com [129.33.205.205])
	by mx0a-001b2d01.pphosted.com with ESMTP id 2ga2pjhydk-1
	(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)
	for ; Thu, 22 Feb 2018 13:56:10 -0500
Received: from localhost
	by e15.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
	for from ; Thu, 22 Feb 2018 13:56:09 -0500
Subject: Re: [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
To: Vaibhav Jain , Russell Currey , Michael Ellerman
Cc: Frederic Barrat , linuxppc-dev@lists.ozlabs.org, Benjamin Herrenschmidt
References: <20180222115803.21738-1-vaibhav@linux.vnet.ibm.com>
From: "Bryant G. Ly"
Date: Thu, 22 Feb 2018 12:56:05 -0600
MIME-Version: 1.0
In-Reply-To: <20180222115803.21738-1-vaibhav@linux.vnet.ibm.com>
Content-Type: text/plain; charset=utf-8
Message-Id:
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,

On 2/22/18 5:58 AM, Vaibhav Jain wrote:
> This patch puts a NULL check before branching to the address pointed
> to by eeh_ops->notify_resume in eeh_report_resume(). The callback
> is used to notify the arch EEH code that a pci device is back
> online.
>
> For PPC64 presently, only an implementation for pseries platform is
> available and not for powernv. Hence without this patch EEH recovery
> on all non-virtualized hosts is causing a kernel panic when
> CONFIG_PCI_IOV is set. The panic is usually is of the form:
>
> EEH: Notify device driver to resume
> Unable to handle kernel paging request for instruction fetch
> Faulting instruction address: 0x00000000
> Oops: Kernel access of bad area, sig: 11 [#1]
>
> LR eeh_report_resume+0x218/0x220
> Call Trace:
> eeh_report_resume+0x1f0/0x220 (unreliable)
> eeh_pe_dev_traverse+0x98/0x170
> eeh_handle_normal_event+0x3f4/0x650
> eeh_handle_event+0x188/0x380
> eeh_event_handler+0x208/0x210
> kthread+0x168/0x1b0
> ret_from_kernel_thread+0x5c/0xb4
>
> Cc: Bryant G. Ly
> Fixes: 856e1eb9bdd4("PCI/AER: Add uevents in AER and EEH error/resume")
> Signed-off-by: Vaibhav Jain
> ---
>  arch/powerpc/kernel/eeh_driver.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index beea2182d754..932858a293ea 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -384,7 +384,8 @@ static void *eeh_report_resume(void *data, void *userdata)
>  	eeh_pcid_put(dev);
>  	pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
>  #ifdef CONFIG_PCI_IOV
> -	eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
> +	if (eeh_ops->notify_resume)
> +		eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
>  #endif
>  	return NULL;
>  }

A version of this patch has already been merged upstream:

https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=fixes&id=521ca5a9859a870e354d1a6b84a6ff4c07bbceb0

-Bryant
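
For anyone reading this thread later, the guard added by the quoted diff can be sketched outside the kernel. This is only a minimal userspace illustration, not the kernel code: struct demo_ops, count_resume() and report_resume() are hypothetical names made up for the example. It assumes an ops table whose notify_resume member is left NULL on platforms that do not implement it, which is the situation the patch describes for powernv.

```c
#include <stddef.h>

/* Hypothetical stand-in for an ops table with an optional callback. */
struct demo_ops {
	void (*notify_resume)(int *counter);	/* may be NULL on some platforms */
};

/* Hypothetical callback: records that a resume notification happened. */
static void count_resume(int *counter)
{
	(*counter)++;
}

/*
 * Call the optional callback only if the platform provides one.
 * Returns 1 if the callback ran, 0 if it was skipped.
 * Calling through a NULL pointer instead would be an instruction fetch
 * from address 0, which matches the faulting address in the oops above.
 */
static int report_resume(const struct demo_ops *ops, int *counter)
{
	if (ops->notify_resume) {
		ops->notify_resume(counter);
		return 1;
	}
	return 0;
}
```

With an ops table that sets notify_resume to count_resume, report_resume() invokes it and returns 1; with the member left NULL, the call is skipped and 0 is returned instead of faulting.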