From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>,
linuxppc-dev@lists.ozlabs.org, alistair@popple.id.au,
ruscur@russell.cc, mpe@ellerman.id.au
Subject: Re: [PATCH v2 1/3] powerpc/eeh: Ignore error handlers in eeh_pe_reset_and_recover()
Date: Wed, 27 Apr 2016 11:16:24 +1000
Message-ID: <20160427011624.GA27020@gwshan>
In-Reply-To: <20160426101731.GA11928@gwshan>
On Tue, Apr 26, 2016 at 08:17:31PM +1000, Gavin Shan wrote:
>On Tue, Apr 26, 2016 at 03:29:59PM +1000, David Gibson wrote:
>>On Fri, Apr 22, 2016 at 11:28:02PM +1000, Gavin Shan wrote:
>>> The function eeh_pe_reset_and_recover() is used to recover from EEH
>>> errors when a passthrough device is transferred to the guest and
>>> back, meaning the device's driver is vfio-pci or none.
>>> When the driver is vfio-pci, which provides only the error_detected()
>>> error handler, the handler simply stops the guest, which is not the
>>> expected behaviour. On the other hand, no error handlers will
>>> be called if we don't have a bound driver.
>>>
>>> This ignores all error handlers provided by the device driver in
>>> eeh_pe_reset_and_recover() to avoid the exceptional behaviour.
>>>
>>> Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
>>> Cc: stable@vger.kernel.org #v3.18+
>>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>> Reviewed-by: Russell Currey <ruscur@russell.cc>
>>> ---
>>> arch/powerpc/kernel/eeh_driver.c | 11 +----------
>>> 1 file changed, 1 insertion(+), 10 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>>> index fb6207d..1c7d703 100644
>>> --- a/arch/powerpc/kernel/eeh_driver.c
>>> +++ b/arch/powerpc/kernel/eeh_driver.c
>>> @@ -552,7 +552,7 @@ static int eeh_clear_pe_frozen_state(struct eeh_pe *pe,
>>>
>>> int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>>> {
>>> - int result, ret;
>>> + int ret;
>>>
>>> /* Bail if the PE is being recovered */
>>> if (pe->state & EEH_PE_RECOVERING)
>>> @@ -564,9 +564,6 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>>> /* Save states */
>>> eeh_pe_dev_traverse(pe, eeh_dev_save_state, NULL);
>>>
>>> - /* Report error */
>>> - eeh_pe_dev_traverse(pe, eeh_report_error, &result);
>>
>>Ok, so after chatting to Gavin, I've made sense of this. The basic
>>thing here is that eeh_pe_reset_and_recover() should be discarding any
>>errors from before the reset, not reporting them - the whole point is
>>that we know things have gone bad, and we want to clear back to a good
>>state.
>>
>>> /* Issue reset */
>>> ret = eeh_reset_pe(pe);
>>> if (ret) {
>>> @@ -581,15 +578,9 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>>> return ret;
>>> }
>>>
>>> - /* Notify completion of reset */
>>> - eeh_pe_dev_traverse(pe, eeh_report_reset, &result);
>>
>>However, it's not clear if removing the report of a reset makes sense.
>>There are no current users of reset notification IIUC, but if we're
>>going to remove the reset reporting, we should put that in a separate
>>patch with its own justification, and remove the other caller as well.
>>
>
>Thanks, David. It makes sense to me. I will split it into two: one removes
>the eeh_report_error notification and the other removes the remaining
>notification handlers.
>
>>> /* Restore device state */
>>> eeh_pe_dev_traverse(pe, eeh_dev_restore_state, NULL);
>>>
>>> - /* Resume */
>>> - eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);
>>
>>And I'm not sure if it makes sense to remove the resume notification either.
>>
>
>Based on the offline talk, we either keep all notification handlers or remove
>all of them. As we can't keep eeh_report_error, we have to remove all of them.
>
v3 was posted for further review. Please ignore this series.
>>> /* Clear recovery mode */
>>> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>>>
>>
>>--
>>David Gibson | I'll have my music baroque, and my code
>>david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
>> | _way_ _around_!
>>http://www.ozlabs.org/~dgibson
>
>
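For readers following the diff, the control flow that remains once the three notification calls are dropped can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct toy_pe, toy_reset(), toy_reset_and_recover() and TOY_PE_RECOVERING are hypothetical stand-ins, not kernel APIs, and the real function performs further steps (such as clearing the frozen PE state) that are not modelled here.

```c
/* Toy userspace model of the post-patch recovery sequence.
 * Every name here is hypothetical; nothing below is kernel code. */

#define TOY_PE_RECOVERING 0x1   /* stand-in for EEH_PE_RECOVERING */

struct toy_pe {
	int state;          /* flag bits, only TOY_PE_RECOVERING is used */
	int cfg;            /* pretend device config state */
	int saved_cfg;      /* copy taken before the reset */
	int handler_calls;  /* would count driver notifications; stays 0 */
	int reset_fails;    /* set non-zero to simulate a failed reset */
};

/* Simulated reset: wipes the config unless told to fail. */
static int toy_reset(struct toy_pe *pe)
{
	if (pe->reset_fails)
		return -1;
	pe->cfg = 0;
	return 0;
}

int toy_reset_and_recover(struct toy_pe *pe)
{
	int ret;

	/* Bail if the PE is being recovered */
	if (pe->state & TOY_PE_RECOVERING)
		return 0;

	/* Put the PE into recovery mode */
	pe->state |= TOY_PE_RECOVERING;

	/* Save states */
	pe->saved_cfg = pe->cfg;

	/* Issue reset; no error, reset or resume handlers are notified */
	ret = toy_reset(pe);
	if (ret) {
		pe->state &= ~TOY_PE_RECOVERING;
		return ret;
	}

	/* Restore device state */
	pe->cfg = pe->saved_cfg;

	/* Clear recovery mode */
	pe->state &= ~TOY_PE_RECOVERING;

	return 0;
}
```

In this model a successful call leaves cfg as it was before the reset and handler_calls untouched, which is the behaviour the review converges on: pre-reset errors are discarded rather than reported to the driver.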