From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: aik@ozlabs.ru, Gavin Shan <gwshan@linux.vnet.ibm.com>,
kvm-ppc@vger.kernel.org, agraf@suse.de,
qiudayu@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v6 2/3] drivers/vfio: EEH support for VFIO PCI device
Date: Sat, 24 May 2014 12:06:21 +1000
Message-ID: <20140524020620.GB4900@shangw>
In-Reply-To: <1400855399.3289.450.camel@ul30vt.home>

On Fri, May 23, 2014 at 08:29:59AM -0600, Alex Williamson wrote:
>On Fri, 2014-05-23 at 14:37 +1000, Gavin Shan wrote:
>> On Thu, May 22, 2014 at 09:10:53PM -0600, Alex Williamson wrote:
>> >On Thu, 2014-05-22 at 18:23 +1000, Gavin Shan wrote:
.../...
>No, sorry, I mean how does the user get information about the error?
>The interface we have here is:
>a) find that something bad has happened
>b) kick it into working again
>c) continue
>
>How does the user figure out what happened and if it makes sense to
>attempt to recover? Where does the user learn that their disk is on
>fire?
>
When 0xFF's are returned from a config or IO read, the user should check
the device (PE)'s state with the ioctl command VFIO_EEH_PE_GET_STATE. If
the device (PE) has been put into the "frozen" state, it's confirmed that
the device (the "disk" you mentioned) is on fire. The user should kick off
recovery, which includes:
- The user stops any operations (config, IO, DMA) on the device, because
  any PCI traffic to a "frozen" device will be dropped at the software or
  hardware level. Also, we don't expect DMA traffic during recovery.
  Otherwise, we will bump into recursive errors and the recovery will fail.
- VFIO_EEH_PE_SET_OPTION to enable the I/O path (the "DMA" path is still
  in the frozen state). VFIO_EEH_PE_CONFIGURE to reconfigure the affected
  PCI bridges, and then do error log retrieval.
- VFIO_EEH_PE_RESET to reset the affected device (PE).
  VFIO_EEH_PE_CONFIGURE to restore BARs.
- The user resumes the device to start PCI traffic, and the device is
  brought back to a functional state.
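
To make the ordering of the steps above concrete, here is a rough sketch in
C. It does not call the real ioctls: the mock_* helpers, struct mock_pe and
the pe_state values are all hypothetical stand-ins invented for this
illustration, and the state transitions are only my simplified model of the
sequence described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PE model; a real client would issue the VFIO ioctls instead. */
enum pe_state { PE_NORMAL, PE_FROZEN, PE_IO_ENABLED, PE_RESET_DONE };

struct mock_pe {
    enum pe_state state;
    bool traffic_stopped;
    bool log_retrieved;
    int configure_calls;
};

/* Reads from a frozen PE come back as all ones. */
static uint32_t mock_config_read(const struct mock_pe *pe, uint32_t value)
{
    return pe->state == PE_FROZEN ? UINT32_MAX : value;
}

/* Stand-in for VFIO_EEH_PE_GET_STATE. */
static enum pe_state mock_get_state(const struct mock_pe *pe)
{
    return pe->state;
}

/* Stand-in for VFIO_EEH_PE_SET_OPTION (enables the I/O path only). */
static int mock_enable_io(struct mock_pe *pe)
{
    if (pe->state != PE_FROZEN)
        return -1;
    pe->state = PE_IO_ENABLED;
    return 0;
}

/* Stand-in for the configure ioctl; this model accepts it after the
 * I/O path is enabled and again after the reset. */
static int mock_configure(struct mock_pe *pe)
{
    if (pe->state != PE_IO_ENABLED && pe->state != PE_RESET_DONE)
        return -1;
    pe->configure_calls++;
    return 0;
}

/* Stand-in for VFIO_EEH_PE_RESET; this model only allows it once the
 * I/O path has been enabled. */
static int mock_reset(struct mock_pe *pe)
{
    if (pe->state != PE_IO_ENABLED)
        return -1;
    pe->state = PE_RESET_DONE;
    return 0;
}

/* Returns 1 if a recovery ran, 0 if none was needed, -1 on failure. */
int recover_if_frozen(struct mock_pe *pe, uint32_t value)
{
    assert(pe != NULL);

    if (mock_config_read(pe, value) != UINT32_MAX)
        return 0;
    if (mock_get_state(pe) != PE_FROZEN)
        return 0;               /* all ones was a genuine register value */

    pe->traffic_stopped = true; /* step 1: quiesce config, IO and DMA */

    if (mock_enable_io(pe) || mock_configure(pe))   /* step 2 */
        return -1;
    pe->log_retrieved = true;   /* error log is fetched at this point */

    if (mock_reset(pe) || mock_configure(pe))       /* step 3 */
        return -1;

    pe->state = PE_NORMAL;      /* step 4: resume PCI traffic */
    pe->traffic_stopped = false;
    return 1;
}
```

The point of the sketch is only the ordering: an all-ones read is treated as
a hint, the state query is what confirms the freeze, and configure is
invoked twice, once before log retrieval and once after the reset.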
.../...
>
>No, I prefer to stay consistent with the rest of the VFIO API and use
>argsz + flags.
>
Here's the recap of my previous reply: I have several cases for ioctl().
- ioctl(fd, cmd, NULL): I don't need any input info.
- ioctl(fd, cmd, &data): I need input info.
For all the cases, should I simply have a data struct that includes
"argsz+flags"? For the return value from ioctl(), can we simply have an
additional field in the above data struct to carry it? "0" is information
I have to return in some of the cases.
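
To illustrate the question, a struct along these lines might look roughly
like the sketch below. The struct name, the "option" and "result" fields and
the handler are assumptions made up for this example, not something taken
from the patch; only the leading argsz + flags pair follows the convention
mentioned above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, not taken from the patch. */
struct eeh_pe_op_sketch {
    uint32_t argsz;   /* size of the struct as the caller sees it */
    uint32_t flags;   /* no flags defined in this sketch; must be 0 */
    uint32_t option;  /* input, used only by commands that take one */
    uint32_t result;  /* output, e.g. a PE state where 0 is a valid value */
};

/* Stand-in for the kernel-side handler of a "get state" style command. */
int handle_get_state(struct eeh_pe_op_sketch *op, uint32_t pe_state)
{
    /* Smallest argsz that still covers the "result" field. */
    size_t minsz = offsetof(struct eeh_pe_op_sketch, result) +
                   sizeof(op->result);

    if (op == NULL || op->argsz < minsz)
        return -EINVAL;
    if (op->flags != 0)
        return -EINVAL;

    op->result = pe_state;  /* the payload travels in the struct ... */
    return 0;               /* ... so 0 can keep meaning "success" */
}
```

With this shape, commands that need no input would still pass the struct
with argsz and flags filled in, and a state value of "0" is no longer
ambiguous with a successful return code.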
.../...
>As agraf noted, I'm asking why reset and configure are separate when
>they seem to be used together.
>
Ok. Here's the recap: they're 2 separate steps of error recovery as
defined in the PAPR spec. Also, they correspond to 2 separate RTAS calls.
So I don't think we can put them together.

Thanks,
Gavin