From: Alex Williamson <alex.williamson@redhat.com>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: aik@ozlabs.ru, agraf@suse.de, kvm-ppc@vger.kernel.org,
qiudayu@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v6 2/3] drivers/vfio: EEH support for VFIO PCI device
Date: Tue, 27 May 2014 11:39:54 -0600
Message-ID: <1401212394.3289.590.camel@ul30vt.home>
In-Reply-To: <20140524020620.GB4900@shangw>

On Sat, 2014-05-24 at 12:06 +1000, Gavin Shan wrote:
> On Fri, May 23, 2014 at 08:29:59AM -0600, Alex Williamson wrote:
> >On Fri, 2014-05-23 at 14:37 +1000, Gavin Shan wrote:
> >> On Thu, May 22, 2014 at 09:10:53PM -0600, Alex Williamson wrote:
> >> >On Thu, 2014-05-22 at 18:23 +1000, Gavin Shan wrote:
>
> .../...
>
> >No, sorry, I mean how does the user get information about the error?
> >The interface we have here is:
> >a) find that something bad has happened
> >b) kick it into working again
> >c) continue
> >
> >How does the user figure out what happened and if it makes sense to
> >attempt to recover? Where does the user learn that their disk is on
> >fire?
> >
>
> When 0xFFs are returned from a config or I/O read, the user should check
> the device (PE)'s state with the ioctl command VFIO_EEH_PE_GET_STATE. If
> the device (PE) has been put into the "frozen" state, it's confirmed that
> the device ("disk" you mentioned) is on fire.

No, this only confirms that something bad happened, not _what_ bad thing
happened.

> User should kick off recovery, which
> includes:
And here you're just describing the kick operation again...
>
> - User stops any operations (config, IO, DMA) on the device because any
> PCI traffic to a "frozen" device will be dropped at the software or
> hardware level. Also, we don't expect DMA traffic during recovery.
> Otherwise, we will bump into recursive errors and the recovery should fail.
> - VFIO_EEH_PE_SET_OPTION to enable I/O path ("DMA" path is still under frozen
> state). EEH_VFIO_PE_CONFIGURE to reconfigure affected PCI bridges and then
> do error log retrieval.

These logs, where do they go? How does the user get access? That's
what I'm trying to ask about.

> - VFIO_EEH_PE_RESET to reset the affected device (PE). EEH_VFIO_PE_CONFIGURE
> to restore BARs.
> - User resumes the device to start PCI traffic and the device is brought to
> a functional state.
>
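
For reference, the call order in the steps quoted above can be sketched
roughly as follows. This is only an illustration under stated assumptions:
the op codes, the fake_pe struct and the eeh_call() stub are hypothetical
stand-ins for the ioctls named in the quoted text, and the state rules in
the stub are guesses made so the sketch runs without hardware. It says
nothing about where the error log ends up.

```c
/* Hypothetical PE states; only "frozen" is named in the quoted text. */
enum pe_state { PE_NORMAL = 0, PE_FROZEN = 1, PE_IO_ENABLED = 2 };

/* Hypothetical op codes standing in for the ioctls in the quoted steps. */
enum eeh_op {
    OP_GET_STATE,   /* VFIO_EEH_PE_GET_STATE */
    OP_ENABLE_IO,   /* VFIO_EEH_PE_SET_OPTION, enable the I/O path */
    OP_CONFIGURE,   /* the "PE configure" ioctl */
    OP_RESET        /* VFIO_EEH_PE_RESET */
};

struct fake_pe {
    enum pe_state state;
    int configure_calls;
};

/* Stub replacing ioctl(): >= 0 on success, -1 on failure (assumed rules). */
int eeh_call(struct fake_pe *pe, enum eeh_op op)
{
    switch (op) {
    case OP_GET_STATE:
        return (int)pe->state;
    case OP_ENABLE_IO:
        if (pe->state != PE_FROZEN)
            return -1;
        pe->state = PE_IO_ENABLED;
        return 0;
    case OP_CONFIGURE:
        if (pe->state == PE_FROZEN)
            return -1;          /* assumed: I/O path must be enabled first */
        pe->configure_calls++;
        return 0;
    case OP_RESET:
        pe->state = PE_NORMAL;
        return 0;
    }
    return -1;
}

/* Returns 1 if a recovery ran, 0 if the PE was not frozen, -1 on failure.
 * The caller is assumed to have stopped config, I/O and DMA beforehand. */
int recover(struct fake_pe *pe)
{
    if (eeh_call(pe, OP_GET_STATE) != PE_FROZEN)
        return 0;
    if (eeh_call(pe, OP_ENABLE_IO) < 0)
        return -1;
    if (eeh_call(pe, OP_CONFIGURE) < 0)   /* bridges, then log retrieval */
        return -1;
    if (eeh_call(pe, OP_RESET) < 0)
        return -1;
    if (eeh_call(pe, OP_CONFIGURE) < 0)   /* restore BARs */
        return -1;
    return 1;
}
```

With a struct fake_pe that starts in PE_FROZEN, recover() walks the four
calls in order and leaves the stub back in PE_NORMAL.
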
> .../...
>
> >
> >No, I prefer to stay consistent with the rest of the VFIO API and use
> >argsz + flags.
> >
>
> Here's the recap for previous reply: I have several cases for ioctl().
>
> - ioctl(fd, cmd, NULL): I needn't any input info.
> - ioctl(fd, cmd, &data): I need input info
>
> For all the cases, should I simply have a data struct to include "argsz+flags"?

Anything that requires data should have argsz+flags, if it doesn't
require data, it doesn't need them, but think long and hard about whether
there's any possibility that we'll need parameters in the future.

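
To illustrate the argsz+flags convention, here is a minimal sketch. The
struct name, the "option" field and the check function are hypothetical;
only the leading argsz and flags members reflect the convention itself. The
check is written as plain C so it can be compiled anywhere, and is not the
actual handler code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument struct; argsz and flags lead, payload follows. */
struct example_eeh_arg {
    uint32_t argsz;   /* size of the struct as the caller knows it */
    uint32_t flags;   /* no flags defined in this sketch; must be zero */
    uint32_t option;  /* hypothetical payload field */
};

/* Smallest size that still contains the last field this code understands. */
#define EXAMPLE_MINSZ \
    (offsetof(struct example_eeh_arg, option) + sizeof(uint32_t))

/* Validates the header the way a handler might: 0 if OK, -EINVAL if not. */
int example_check_arg(const struct example_eeh_arg *arg)
{
    if (arg->argsz < EXAMPLE_MINSZ)
        return -EINVAL;   /* caller's struct is too small */
    if (arg->flags != 0)
        return -EINVAL;   /* unknown flag bits are rejected */
    return 0;
}
```

The point of the header is that fields can be appended later: a caller
announces how much it filled in via argsz, and new behaviour can be gated
on flag bits that are rejected until they are defined.
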
> For the return value from ioctl(), can we simply have an additional field in
> the above data struct to carry it? "0" is the information I have to return
> for some of the cases.

If for instance your ioctl is returning something like "number of
errors", then it's perfectly fine to use that as the ioctl return. < 0
is an error, >= 0 is a success with a value.
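
A small sketch of that return convention, with hypothetical names
throughout (there is no "error count" call in the patch under discussion;
it is just the example used above). The handler side returns a negative
errno or a non-negative value, and the caller treats 0 as a valid result
rather than as "nothing returned".

```c
#include <errno.h>

/* Hypothetical handler: a count on success, a negative errno on failure. */
int example_error_count(int pe_valid, int pending_errors)
{
    if (!pe_valid)
        return -ENODEV;
    return pending_errors;    /* 0 is a legitimate success value */
}

/* Caller side: returns 1 and stores the value on success, 0 on error. */
int example_handle(int ret, int *value)
{
    if (ret < 0)
        return 0;             /* error: *value is left untouched */
    *value = ret;
    return 1;
}
```

Note that from userspace the libc ioctl() wrapper reports a failure as -1
with the code in errno, so the negative errno shown here is what the
handler side returns, not what the application sees directly.
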