* Re: [PATCH] powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
2015-10-23 6:19 [PATCH] powerpc/eeh: eeh_pci_enable(): fix checking of post-request state Andrew Donnellan
@ 2015-10-24 1:23 ` Gavin Shan
2015-10-26 23:05 ` Daniel Axtens
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Gavin Shan @ 2015-10-24 1:23 UTC (permalink / raw)
To: Andrew Donnellan; +Cc: linuxppc-dev, Gavin Shan
On Fri, Oct 23, 2015 at 05:19:46PM +1100, Andrew Donnellan wrote:
>In eeh_pci_enable(), after making the request to set the new options, we
>call eeh_ops->wait_state() to check that the request finished successfully.
>
>At the moment, if eeh_ops->wait_state() returns 0, we return 0 without
>checking that it reflects the expected outcome. This can lead to callers
>further up the chain incorrectly assuming the slot has been successfully
>unfrozen and continuing to attempt recovery.
>
>On powernv, this will occur if pnv_eeh_get_pe_state() or
>pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL
>call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or
>OPAL_EEH_PHB_ERROR respectively.
>
>On pseries, this will occur if pseries_eeh_get_state() returns 0, which in
>turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA
>Stopped states.
>
>Obviously, none of these cases represent a successful completion of a
>request to thaw MMIO or DMA.
>
>Fix the check so that a wait_state() return value of 0 won't be considered
>successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases.
>
>Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
>Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Thanks,
Gavin
>---
> arch/powerpc/kernel/eeh.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index 40e4d4a..d757e7c 100644
>--- a/arch/powerpc/kernel/eeh.c
>+++ b/arch/powerpc/kernel/eeh.c
>@@ -677,7 +677,7 @@ int eeh_pci_enable(struct eeh_pe *pe, int function)
> /* Check if the request is finished successfully */
> if (active_flag) {
> rc = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
>- if (rc <= 0)
>+ if (rc < 0)
> return rc;
>
> if (rc & active_flag)
>--
>Andrew Donnellan Software Engineer, OzLabs
>andrew.donnellan@au1.ibm.com Australia Development Lab, Canberra
>+61 2 6201 8874 (work) IBM Australia Limited
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
2015-10-23 6:19 [PATCH] powerpc/eeh: eeh_pci_enable(): fix checking of post-request state Andrew Donnellan
2015-10-24 1:23 ` Gavin Shan
@ 2015-10-26 23:05 ` Daniel Axtens
2015-10-26 23:09 ` Daniel Axtens
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Daniel Axtens @ 2015-10-26 23:05 UTC (permalink / raw)
To: Andrew Donnellan, linuxppc-dev; +Cc: Gavin Shan
[-- Attachment #1: Type: text/plain, Size: 948 bytes --]
Andrew Donnellan <andrew.donnellan@au1.ibm.com> writes:
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 40e4d4a..d757e7c 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -677,7 +677,7 @@ int eeh_pci_enable(struct eeh_pe *pe, int function)
> /* Check if the request is finished successfully */
> if (active_flag) {
> rc = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
> - if (rc <= 0)
> + if (rc < 0)
> return rc;
>
> if (rc & active_flag)
Reviewed-by: Daniel Axtens <dja@axtens.net>
> --
> Andrew Donnellan Software Engineer, OzLabs
> andrew.donnellan@au1.ibm.com Australia Development Lab, Canberra
> +61 2 6201 8874 (work) IBM Australia Limited
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 859 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
2015-10-23 6:19 [PATCH] powerpc/eeh: eeh_pci_enable(): fix checking of post-request state Andrew Donnellan
2015-10-24 1:23 ` Gavin Shan
2015-10-26 23:05 ` Daniel Axtens
@ 2015-10-26 23:09 ` Daniel Axtens
2016-02-08 11:37 ` Michael Ellerman
2016-03-09 12:51 ` Michael Ellerman
4 siblings, 0 replies; 9+ messages in thread
From: Daniel Axtens @ 2015-10-26 23:09 UTC (permalink / raw)
To: Andrew Donnellan, linuxppc-dev; +Cc: Gavin Shan
[-- Attachment #1: Type: text/plain, Size: 948 bytes --]
Andrew Donnellan <andrew.donnellan@au1.ibm.com> writes:
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 40e4d4a..d757e7c 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -677,7 +677,7 @@ int eeh_pci_enable(struct eeh_pe *pe, int function)
> /* Check if the request is finished successfully */
> if (active_flag) {
> rc = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
> - if (rc <= 0)
> + if (rc < 0)
> return rc;
>
> if (rc & active_flag)
Reviewed-by: Daniel Axtens <dja@axtens.net>
> --
> Andrew Donnellan Software Engineer, OzLabs
> andrew.donnellan@au1.ibm.com Australia Development Lab, Canberra
> +61 2 6201 8874 (work) IBM Australia Limited
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 859 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
2015-10-23 6:19 [PATCH] powerpc/eeh: eeh_pci_enable(): fix checking of post-request state Andrew Donnellan
` (2 preceding siblings ...)
2015-10-26 23:09 ` Daniel Axtens
@ 2016-02-08 11:37 ` Michael Ellerman
2016-02-08 23:57 ` Andrew Donnellan
2016-03-09 12:51 ` Michael Ellerman
4 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2016-02-08 11:37 UTC (permalink / raw)
To: Andrew Donnellan, linuxppc-dev; +Cc: Gavin Shan
On Fri, 2015-23-10 at 06:19:46 UTC, Andrew Donnellan wrote:
> In eeh_pci_enable(), after making the request to set the new options, we
> call eeh_ops->wait_state() to check that the request finished successfully.
>
> At the moment, if eeh_ops->wait_state() returns 0, we return 0 without
> checking that it reflects the expected outcome. This can lead to callers
> further up the chain incorrectly assuming the slot has been successfully
> unfrozen and continuing to attempt recovery.
>
> On powernv, this will occur if pnv_eeh_get_pe_state() or
> pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL
> call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or
> OPAL_EEH_PHB_ERROR respectively.
>
> On pseries, this will occur if pseries_eeh_get_state() returns 0, which in
> turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA
> Stopped states.
>
> Obviously, none of these cases represent a successful completion of a
> request to thaw MMIO or DMA.
>
> Fix the check so that a wait_state() return value of 0 won't be considered
> successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases.
>
> Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Reviewed-by: Daniel Axtens <dja@axtens.net>
Looks like I missed this?
Should it still go in? Is it a fix? If so when did it break, and should it go
to stable?
cheers
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
2016-02-08 11:37 ` Michael Ellerman
@ 2016-02-08 23:57 ` Andrew Donnellan
2016-03-08 7:42 ` Andrew Donnellan
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Donnellan @ 2016-02-08 23:57 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev; +Cc: Gavin Shan, Daniel Axtens
On 08/02/16 22:37, Michael Ellerman wrote:
> Looks like I missed this?
>
> Should it still go in? Is it a fix? If so when did it break, and should it go
> to stable?
It is a fix - I'm a bit hazy on the details now but IIRC, Daniel Axtens
and I encountered this when doing some cxl debugging, though I think we
decided not to tag this for stable since it was a secondary issue to the
primary bug we were looking for. It probably could go to stable though?
(Daniel - thoughts?)
The line in question was last touched in 4d4f577e4b5e, but it looks like
the behaviour wasn't right even before that.
Andrew
--
Andrew Donnellan Software Engineer, OzLabs
andrew.donnellan@au1.ibm.com Australia Development Lab, Canberra
+61 2 6201 8874 (work) IBM Australia Limited
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
2016-02-08 23:57 ` Andrew Donnellan
@ 2016-03-08 7:42 ` Andrew Donnellan
2016-03-08 22:14 ` Gavin Shan
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Donnellan @ 2016-03-08 7:42 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev; +Cc: Gavin Shan, Daniel Axtens
On 09/02/16 10:57, Andrew Donnellan wrote:
> It is a fix - I'm a bit hazy on the details now but IIRC, Daniel Axtens
> and I encountered this when doing some cxl debugging, though I think we
> decided not to tag this for stable since it was a secondary issue to the
> primary bug we were looking for. It probably could go to stable though?
> (Daniel - thoughts?)
>
> The line in question was last touched in 4d4f577e4b5e, but it looks like
> the behaviour wasn't right even before that.
Ping :)
I think this patch should still go in, I don't really care whether it
goes to stable or not.
--
Andrew Donnellan Software Engineer, OzLabs
andrew.donnellan@au1.ibm.com Australia Development Lab, Canberra
+61 2 6201 8874 (work) IBM Australia Limited
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
2016-03-08 7:42 ` Andrew Donnellan
@ 2016-03-08 22:14 ` Gavin Shan
0 siblings, 0 replies; 9+ messages in thread
From: Gavin Shan @ 2016-03-08 22:14 UTC (permalink / raw)
To: Andrew Donnellan
Cc: Michael Ellerman, linuxppc-dev, Gavin Shan, Daniel Axtens
On Tue, Mar 08, 2016 at 06:42:56PM +1100, Andrew Donnellan wrote:
>On 09/02/16 10:57, Andrew Donnellan wrote:
>>It is a fix - I'm a bit hazy on the details now but IIRC, Daniel Axtens
>>and I encountered this when doing some cxl debugging, though I think we
>>decided not to tag this for stable since it was a secondary issue to the
>>primary bug we were looking for. It probably could go to stable though?
>>(Daniel - thoughts?)
>>
>>The line in question was last touched in 4d4f577e4b5e, but it looks like
>>the behaviour wasn't right even before that.
>
>Ping :)
>
>I think this patch should still go in, I don't really care whether it goes to
>stable or not.
>
Andrew, I lost the context about this. If you need any helps like review,
please ping me from IRC.
Thanks,
Gavin
>--
>Andrew Donnellan Software Engineer, OzLabs
>andrew.donnellan@au1.ibm.com Australia Development Lab, Canberra
>+61 2 6201 8874 (work) IBM Australia Limited
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
2015-10-23 6:19 [PATCH] powerpc/eeh: eeh_pci_enable(): fix checking of post-request state Andrew Donnellan
` (3 preceding siblings ...)
2016-02-08 11:37 ` Michael Ellerman
@ 2016-03-09 12:51 ` Michael Ellerman
4 siblings, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2016-03-09 12:51 UTC (permalink / raw)
To: Andrew Donnellan, linuxppc-dev; +Cc: Gavin Shan
On Fri, 2015-23-10 at 06:19:46 UTC, Andrew Donnellan wrote:
> In eeh_pci_enable(), after making the request to set the new options, we
> call eeh_ops->wait_state() to check that the request finished successfully.
>
> At the moment, if eeh_ops->wait_state() returns 0, we return 0 without
> checking that it reflects the expected outcome. This can lead to callers
> further up the chain incorrectly assuming the slot has been successfully
> unfrozen and continuing to attempt recovery.
>
> On powernv, this will occur if pnv_eeh_get_pe_state() or
> pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL
> call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or
> OPAL_EEH_PHB_ERROR respectively.
>
> On pseries, this will occur if pseries_eeh_get_state() returns 0, which in
> turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA
> Stopped states.
>
> Obviously, none of these cases represent a successful completion of a
> request to thaw MMIO or DMA.
>
> Fix the check so that a wait_state() return value of 0 won't be considered
> successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases.
>
> Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Reviewed-by: Daniel Axtens <dja@axtens.net>
> Reviewed-by: Daniel Axtens <dja@axtens.net>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/949e9b827eb4736d96df520c67
cheers
^ permalink raw reply [flat|nested] 9+ messages in thread