qemu-devel.nongnu.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Cao jin <caoj.fnst@cn.fujitsu.com>
Cc: qemu-devel@nongnu.org, alex.williamson@redhat.com,
	izumi.taku@jp.fujitsu.com,
	Chen Fan <chen.fan.fnst@cn.fujitsu.com>,
	Dou Liyang <douly.fnst@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest
Date: Tue, 10 Jan 2017 00:55:00 +0200
Message-ID: <20170110005437-mutt-send-email-mst@kernel.org>
In-Reply-To: <1483175588-17006-4-git-send-email-caoj.fnst@cn.fujitsu.com>

On Sat, Dec 31, 2016 at 05:13:07PM +0800, Cao jin wrote:
> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> 
> When an uncorrectable error happens on a physical device, the vfio_pci
> driver signals the uncorrectable error status register value to the
> corresponding QEMU vfio-pci device via the eventfd registered by that
> device; the vfio-pci error eventfd handler is then invoked in the
> event loop.
> 
> Construct the AER message and pass it to the root port; the root port
> triggers an interrupt to signal the guest, and the guest driver then
> does the recovery.
> 
> Note: only recovery from non-fatal errors is supported for now; a fatal
> error still results in a VM stop.
> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 42 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 76a8ac3..9861f72 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
>  static void vfio_err_notifier_handler(void *opaque)
>  {
>      VFIOPCIDevice *vdev = opaque;
> +    PCIDevice *dev = &vdev->pdev;
> +    PCIEAERMsg msg = {
> +        .severity = 0,
> +        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
> +    };
> +    int len;
> +    uint64_t uncor_status;
> +
> +    /* Read uncorrectable error status from driver */
> +    len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status));
> +    if (len != sizeof(uncor_status)) {
> +        error_report("vfio-pci: uncor error status reading returns"
> +                     " invalid number of bytes: %d", len);
> +        return; //Or goto stop?

I would definitely suggest the goto stop here, to make sure we don't regress.
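
A minimal standalone sketch of the goto stop variant, with the QEMU
specifics stubbed out so that it compiles on its own. The names
handle_notification and enum action are hypothetical and do not come from
the patch; the real handler would call pcie_aer_msg() or vm_stop() instead
of returning a value. Only the control flow matters here: a short or failed
read takes the same path as a fatal error.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical outcome type standing in for pcie_aer_msg() / vm_stop(). */
enum action { ACTION_FORWARD_AER, ACTION_STOP_VM };

/*
 * Read the 64-bit uncorrectable status from fd. Anything other than a
 * full 8-byte read is treated like the "stop" label in the patch.
 */
static enum action handle_notification(int fd, uint64_t *uncor_status)
{
    ssize_t len;

    assert(uncor_status != NULL);

    len = read(fd, uncor_status, sizeof(*uncor_status));
    if (len != (ssize_t)sizeof(*uncor_status)) {
        fprintf(stderr, "uncor status read returned %zd bytes\n", len);
        return ACTION_STOP_VM;      /* equivalent of "goto stop" */
    }

    return ACTION_FORWARD_AER;
}
```

With this shape an unexpected read result can never leave the guest running
with an unreported error.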

> +    }
> +
> +    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
> +        goto stop;
> +    }
> +
> +    /* Populate the aer msg and send it to root port */
> +    if (dev->exp.aer_cap) {
> +        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
> +        bool isfatal = uncor_status &
> +                       pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
> +
> +        if (isfatal) {
> +            goto stop;
> +        }
> +
> +        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
> +                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;
>  
> -    if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
> +        error_report("vfio-pci device %d sending AER to root port. uncor"
> +                     " status = 0x%"PRIx64, dev->devfn, uncor_status);
> +        pcie_aer_msg(dev, &msg);
>          return;
>      }
>  
> +stop:
>      /*
> -     * TBD. Retrieve the error details and decide what action
> -     * needs to be taken. One of the actions could be to pass
> -     * the error to the guest and have the guest driver recover
> -     * from the error. This requires that PCIe capabilities be
> -     * exposed to the guest. For now, we just terminate the
> -     * guest to contain the error.
> +     * Terminate the guest in case of
> +     * 1. AER capability is not exposed to guest.
> +     * 2. AER capability is exposed, but error is fatal, only non-fatal
> +     * error is handled now.
>       */
>  
> -    error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
> +    error_report("%s(%s) fatal error detected. Please collect any data"
> +            " possible and then kill the guest", __func__, vdev->vbasedev.name);
>  
>      vm_stop(RUN_STATE_INTERNAL_ERROR);
>  }
> -- 
> 1.8.3.1
> 
> 
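
For reference, the two computations the handler relies on can be sketched
standalone as below. The helper names aer_source_id and aer_is_fatal and the
EX_ macros are made up for illustration. The bit positions follow the AER
Uncorrectable Error Status layout as I understand it (bit 4 Data Link
Protocol Error, bit 12 Poisoned TLP, bit 14 Completion Timeout). An error is
fatal when its status bit is also set in the Uncorrectable Error Severity
register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Example bit masks (assumed layout of the AER uncorrectable registers). */
#define EX_UNC_DLP        0x00000010u  /* Data Link Protocol Error */
#define EX_UNC_POISON_TLP 0x00001000u  /* Poisoned TLP */
#define EX_UNC_COMP_TIME  0x00004000u  /* Completion Timeout */

/* Requester ID as built in the patch: bus number in bits 15:8, devfn in 7:0. */
static uint16_t aer_source_id(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)((bus << 8) | devfn);
}

/* Fatal if any reported status bit is marked fatal in the severity mask. */
static bool aer_is_fatal(uint64_t uncor_status, uint32_t uncor_sever)
{
    return (uncor_status & uncor_sever) != 0;
}
```

For example, a Completion Timeout reported while only the Data Link Protocol
bit is set in the severity register is classified as non-fatal, which is the
only case this patch forwards to the guest.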
