From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e23smtp06.au.ibm.com (e23smtp06.au.ibm.com [202.81.31.148]) (using TLSv1 with cipher CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 7C01D1A0145 for ; Fri, 25 Sep 2015 18:20:39 +1000 (AEST) Received: from /spool/local by e23smtp06.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 25 Sep 2015 18:20:37 +1000 Received: from d23relay08.au.ibm.com (d23relay08.au.ibm.com [9.185.71.33]) by d23dlp01.au.ibm.com (Postfix) with ESMTP id 09A542CE8052 for ; Fri, 25 Sep 2015 18:20:34 +1000 (EST) Received: from d23av01.au.ibm.com (d23av01.au.ibm.com [9.190.234.96]) by d23relay08.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t8P8KFvX197080 for ; Fri, 25 Sep 2015 18:20:23 +1000 Received: from d23av01.au.ibm.com (localhost [127.0.0.1]) by d23av01.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t8P8K1BJ005044 for ; Fri, 25 Sep 2015 18:20:01 +1000 Date: Fri, 25 Sep 2015 16:19:36 +0800 From: Wei Yang To: Gavin Shan Cc: Wei Yang , linuxppc-dev@lists.ozlabs.org Subject: Re: [PATCH] powerpc/eeh: Avoid to handle EEH on a passed Child PE Message-ID: <20150925081936.GA8618@richards-mbp.cn.ibm.com> Reply-To: Wei Yang References: <1442827788-10385-1-git-send-email-weiyang@linux.vnet.ibm.com> <20150921114945.GA17285@gwshan> <20150922044303.GB2072@Richards-MacBook-Pro.local> <20150922230741.GA6721@gwshan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <20150922230741.GA6721@gwshan> List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On Wed, Sep 23, 2015 at 09:07:41AM +1000, Gavin Shan wrote: >On Tue, Sep 22, 2015 at 12:43:03PM +0800, Wei Yang wrote: >>On Mon, Sep 21, 2015 at 09:49:45PM +1000, Gavin Shan wrote: >>>On Mon, Sep 21, 2015 at 05:29:48PM +0800, Wei Yang wrote: >>>>Current EEH infrastructure would avoid to handle EEH when a PE is passed to >>>>guest, while if this PE is a Child PE of the one hit EEH, host would handle >>>>this. By doing so, this would leads to guest hang. The correct way is >>>>avoid to handle it on host and let guest to recover. >>>> >>>>This patch avoids to handle EEH on a passed Child PE. >>>> >>> >>>Ok. It's fixing the problem the guest, which owns a VF, when its PF hitting >>>EEH error, right? If so, I'm not sure if you really tested this code. Does >>>it work for you? >> >>Yes, I inject error on Parent Bus PE. >> >>> >>>When the parent PE (PF) is stopped for EEH recovery, it sounds impossible >>>that the child PE can't be affected and just escape from the error. The >>>question is how the guest can continue to work after the EEH recovery on >>>parent PE? >> >>What I see is the PF is covering and VF in guest is recovering. >> > >What do you mean by "covering"? Which PE's error is detected first in your >testing? There is potentially race here: when the VF PE's error is detected >first and guest tries to recover it. After the recovery happened on guest >side, the host detects the PF PE's error and tries to recover it. During >the recovery, the VF PE is total unusable but guest doesn't know that and >operate like usual. I'm not sure what kinds of problems it can incur, but >it would incur issues. > >On the other hand, if PF PE's error is detected on host first, and then the >guest detects the error on VF PE. After that, the host and guest try to do >recovery at same time. Host issues PE reset to PF PE, which isn't finished >yet. Then guest issues PE reset to VF PE, which will cause another EEH error. > >So I'm not sure if had this patch fully tested. If so, it seems the result is >just achieved by luck, I guess. > It looks really has some race condition. So the expected order should be PF's PE recovered then VF's PE recover in guest, right? >>> >>>>Signed-off-by: Wei Yang >>>>--- >>>> arch/powerpc/kernel/eeh_pe.c | 5 +++++ >>>> 1 file changed, 5 insertions(+) >>>> >>>>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c >>>>index 5cde950..c6d0e9f 100644 >>>>--- a/arch/powerpc/kernel/eeh_pe.c >>>>+++ b/arch/powerpc/kernel/eeh_pe.c >>>>@@ -172,6 +172,7 @@ static struct eeh_pe *eeh_pe_next(struct eeh_pe *pe, >>>> * callback returns something other than NULL, or no more PEs >>>> * to be traversed. >>>> */ >>>>+static void *__eeh_pe_get(void *data, void *flag); >>>> void *eeh_pe_traverse(struct eeh_pe *root, >>>> eeh_traverse_func fn, void *flag) >>>> { >>>>@@ -179,6 +180,8 @@ void *eeh_pe_traverse(struct eeh_pe *root, >>>> void *ret; >>>> >>>> for (pe = root; pe; pe = eeh_pe_next(pe, root)) { >>>>+ if (eeh_pe_passed(pe) && (fn != __eeh_pe_get)) >>>>+ continue; >>> >>>The code change here seems ugly. >>> >>>The "flag" can be extended to carry the information to skip pass-through >>>PEs or not. So the function calling eeh_pe_traverse() decides to skip >>>pass-through PEs or not. >> >>I don't get the point, which "flag" you mean? Add a flag in eeh_pe? >> > >>>> void *eeh_pe_traverse(struct eeh_pe *root, >>>> eeh_traverse_func fn, void *flag) > ^^^^^^^^^^ > This one > >The code needn't to be changed in a hurry though. I don't think it's right >way to fix the issue as discussed as above. > >>> >>>> ret = fn(pe, flag); >>>> if (ret) return ret; >>>> } >>>>@@ -210,6 +213,8 @@ void *eeh_pe_dev_traverse(struct eeh_pe *root, >>> >>>> >>>> /* Traverse root PE */ >>>> for (pe = root; pe; pe = eeh_pe_next(pe, root)) { >>>>+ if (eeh_pe_passed(pe)) >>>>+ continue; >>>> eeh_pe_for_each_dev(pe, edev, tmp) { >>>> ret = fn(edev, flag); >>>> if (ret) >>>>-- >>>>2.5.0 >>>> >> >>-- >>Richard Yang >>Help you, Help me -- Richard Yang Help you, Help me