From: David Gibson <david@gibson.dropbear.id.au>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, alistair@popple.id.au,
	ruscur@russell.cc, mpe@ellerman.id.au
Subject: Re: [PATCH v2 1/3] powerpc/eeh: Ignore error handlers in eeh_pe_reset_and_recover()
Date: Tue, 26 Apr 2016 15:29:59 +1000
Message-ID: <20160426052959.GJ15176@voom.fritz.box>
In-Reply-To: <1461331687-1069-1-git-send-email-gwshan@linux.vnet.ibm.com>

On Fri, Apr 22, 2016 at 11:28:02PM +1000, Gavin Shan wrote:
> The function eeh_pe_reset_and_recover() is used to recover EEH
> error when the passthrough device are transferred to guest and
> backwards, meaning the device's driver is vfio-pci or none.
> When the driver is vfio-pci that provides error_detected() error
> handler only, the handler simply stops the guest and it's not
> expected behaviour. On the other hand, no error handlers will
> be called if we don't have a bound driver.
> 
> This ignores all error handlers provided by device driver in
> eeh_pe_reset_and_recover() to avoid the exceptional behaviour.
> 
> Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
> Cc: stable@vger.kernel.org #v3.18+
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> Reviewed-by: Russell Currey <ruscur@russell.cc>
> ---
>  arch/powerpc/kernel/eeh_driver.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index fb6207d..1c7d703 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -552,7 +552,7 @@ static int eeh_clear_pe_frozen_state(struct eeh_pe *pe,
>  
>  int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>  {
> -	int result, ret;
> +	int ret;
>  
>  	/* Bail if the PE is being recovered */
>  	if (pe->state & EEH_PE_RECOVERING)
> @@ -564,9 +564,6 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>  	/* Save states */
>  	eeh_pe_dev_traverse(pe, eeh_dev_save_state, NULL);
>  
> -	/* Report error */
> -	eeh_pe_dev_traverse(pe, eeh_report_error, &result);

Ok, so after chatting to Gavin, I've made sense of this.  The basic
thing here is that eeh_pe_reset_and_recover() should be discarding any
errors from before the reset, not reporting them - the whole point is
that we know things have gone bad, and we want to clear back to a good
state.
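
To make the intended flow concrete, here is a rough userspace sketch of
the sequence the function is left with once the error report is gone.
Everything named mock_* is a hypothetical stand-in, not the real kernel
code; the return value on the early bail-out is an assumption, and the
real function does more between the reset and the restore than shown.

```c
/* Hypothetical stand-ins for the kernel types; not the real definitions. */
#define MOCK_PE_RECOVERING 0x1

struct mock_pe {
	int state;     /* bit flags, only MOCK_PE_RECOVERING is modelled */
	int saved;     /* set once device state has been saved */
	int restored;  /* set once device state has been restored */
	int reset_rc;  /* value the mock reset returns (0 means success) */
};

/* Stand-in for the reset step; simply returns the preset result. */
static int mock_reset_pe(struct mock_pe *pe)
{
	return pe->reset_rc;
}

static int mock_reset_and_recover(struct mock_pe *pe)
{
	int ret;

	/* Bail if the PE is being recovered (returning 0 is an assumption) */
	if (pe->state & MOCK_PE_RECOVERING)
		return 0;

	pe->state |= MOCK_PE_RECOVERING;

	/* Save states */
	pe->saved = 1;

	/*
	 * No error report here: anything that went wrong before the
	 * reset is discarded, not passed on to the driver.
	 */

	/* Issue reset */
	ret = mock_reset_pe(pe);
	if (ret) {
		pe->state &= ~MOCK_PE_RECOVERING;
		return ret;
	}

	/* Restore device state */
	pe->restored = 1;

	/* Clear recovery mode */
	pe->state &= ~MOCK_PE_RECOVERING;
	return 0;
}
```

The point of the sketch is only the ordering: save, reset, restore, with
no driver error handler involved on the way in.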

>  	/* Issue reset */
>  	ret = eeh_reset_pe(pe);
>  	if (ret) {
> @@ -581,15 +578,9 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>  		return ret;
>  	}
>  
> -	/* Notify completion of reset */
> -	eeh_pe_dev_traverse(pe, eeh_report_reset, &result);

However, it's not clear that removing the reset notification makes
sense.  There are no current users of reset notification IIUC, but if
we're going to remove the reset reporting, we should put that in a
separate patch with its own justification, and remove the other caller
as well.

>  	/* Restore device state */
>  	eeh_pe_dev_traverse(pe, eeh_dev_restore_state, NULL);
>  
> -	/* Resume */
> -	eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);

And I'm not sure if it makes sense to remove the resume notification either.

>  	/* Clear recovery mode */
>  	eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
