From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: izumi.taku@jp.fujitsu.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v5 5/7] vfio-pci: pass the aer error to guest
Date: Mon, 16 Mar 2015 15:35:13 +0800
Message-ID: <550687B1.7020504@cn.fujitsu.com>
In-Reply-To: <1426477927.3643.160.camel@redhat.com>


On 03/16/2015 11:52 AM, Alex Williamson wrote:
> On Mon, 2015-03-16 at 11:05 +0800, Chen Fan wrote:
>> On 03/14/2015 06:34 AM, Alex Williamson wrote:
>>> On Thu, 2015-03-12 at 18:23 +0800, Chen Fan wrote:
>>>> When the vfio device encounters an uncorrectable error in the host,
>>>> the vfio_pci driver signals the eventfd registered by this
>>>> vfio device, which results in the qemu eventfd handler getting
>>>> invoked.
>>>>
>>>> This patch passes the error to the guest and has the guest driver
>>>> recover from the error.
>>> What is going to be the typical recovery mechanism for the guest?  I'm
>>> concerned that the topology of the device in the guest doesn't
>>> necessarily match the topology of the device in the host, so if the
>>> guest were to attempt a bus reset to recover a device, for instance,
>>> what happens?
>> The recovery mechanism is that when the guest receives an AER error from a
>> device, it clears the corresponding status bit in the device register. For
>> errors that need a device reset, the guest AER driver resets all devices
>> under the bus.
> Sorry, I'm still confused, how does the guest aer driver reset all
> devices under a bus?  Are we talking about function-level, device
> specific reset mechanisms or secondary bus resets?  If the guest is
> performing secondary bus resets, what guarantee do they have that it
> will translate to a physical secondary bus reset?  vfio may only do an
> FLR when the bus is reset or it may not be able to do anything depending
> on the available function-level resets and physical and virtual topology
> of the device.  Thanks,
In general, each function depends on its device driver to do the recovery,
e.g. through the driver's error_detected and slot_reset callbacks.
For a link reset, the guest usually does a secondary bus reset.
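
To make that sequence concrete, here is a rough sketch of it in plain C.
It is not kernel code: every type and function name below is a hypothetical
stand-in, only loosely modeled on the error_detected and slot_reset callbacks,
and the bus reset is simulated by setting a flag on each device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins loosely modeled on the Linux PCI error recovery
 * callbacks; names and types here are illustrative, not the kernel's. */
enum ers_result { ERS_CAN_RECOVER, ERS_NEED_RESET, ERS_RECOVERED, ERS_DISCONNECT };

struct fake_dev;

struct err_handlers {
    enum ers_result (*error_detected)(struct fake_dev *dev, int fatal);
    enum ers_result (*slot_reset)(struct fake_dev *dev);
};

struct fake_dev {
    const struct err_handlers *handlers; /* NULL if the driver has none */
    int was_reset;                       /* set by the simulated bus reset */
};

/* Simulated secondary bus reset: every device under the bus is reset. */
void fake_bus_reset(struct fake_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        devs[i].was_reset = 1;
}

/* Walk all devices under one bus through the recovery sequence. */
enum ers_result recover_bus(struct fake_dev *devs, size_t n, int fatal)
{
    int need_reset = 0;

    assert(devs != NULL);

    /* Step 1: report the error to each driver and collect its answer. */
    for (size_t i = 0; i < n; i++) {
        const struct err_handlers *h = devs[i].handlers;
        if (!h || !h->error_detected)
            return ERS_DISCONNECT; /* no handler: cannot recover */
        enum ers_result r = h->error_detected(&devs[i], fatal);
        if (r == ERS_DISCONNECT)
            return r;
        if (r == ERS_NEED_RESET)
            need_reset = 1;
    }
    if (!need_reset)
        return ERS_RECOVERED;

    /* Step 2: one driver asked for a reset, so the whole bus is reset. */
    fake_bus_reset(devs, n);

    /* Step 3: let each driver reinitialize its device after the reset. */
    for (size_t i = 0; i < n; i++) {
        const struct err_handlers *h = devs[i].handlers;
        if (!h->slot_reset || h->slot_reset(&devs[i]) != ERS_RECOVERED)
            return ERS_DISCONNECT;
    }
    return ERS_RECOVERED;
}

/* Example driver: asks for a reset on fatal errors only. */
enum ers_result demo_error_detected(struct fake_dev *dev, int fatal)
{
    (void)dev;
    return fatal ? ERS_NEED_RESET : ERS_CAN_RECOVER;
}

/* Example driver: recovers only if the device really was reset. */
enum ers_result demo_slot_reset(struct fake_dev *dev)
{
    return dev->was_reset ? ERS_RECOVERED : ERS_DISCONNECT;
}

const struct err_handlers demo_handlers = { demo_error_detected, demo_slot_reset };
```

Calling recover_bus() on an array of fake_dev entries that all point at
demo_handlers resets every device in the array as soon as the error is fatal,
which is the "reset all devices under bus" behavior described above.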

And must a bus reset of a vfio device be carried out as a physical
secondary bus reset?

Thanks,
Chen

>
> Alex
>
>>>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>>> ---
>>>>    hw/vfio/pci.c | 34 ++++++++++++++++++++++++++++------
>>>>    1 file changed, 28 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>> index 0a515b6..8966c49 100644
>>>> --- a/hw/vfio/pci.c
>>>> +++ b/hw/vfio/pci.c
>>>> @@ -3240,18 +3240,40 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
>>>>    static void vfio_err_notifier_handler(void *opaque)
>>>>    {
>>>>        VFIOPCIDevice *vdev = opaque;
>>>> +    PCIDevice *dev = &vdev->pdev;
>>>> +    PCIEAERMsg msg = {
>>>> +        .severity = 0,
>>>> +        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
>>>> +    };
>>>>    
>>>>        if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>>>            return;
>>>>        }
>>>>    
>>>> +    /* we should read the error details from the real hardware
>>>> +     * configuration spaces, here we only need to do is signaling
>>>> +     * to guest an uncorrectable error has occurred.
>>>> +     */
>>> Inconsistent comment style
>>>
>>>> +     if(dev->exp.aer_cap) {
>>>            ^ space
>>>
>>>> +        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
>>>> +        uint32_t uncor_status;
>>>> +        bool isfatal;
>>>> +
>>>> +        uncor_status = vfio_pci_read_config(dev,
>>>> +                           dev->exp.aer_cap + PCI_ERR_UNCOR_STATUS, 4);
>>>> +
>>>> +        isfatal = uncor_status & pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
>>>> +
>>>> +        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
>>>> +                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;
>>>> +
>>>> +        pcie_aer_msg(dev, &msg);
>>>> +        return;
>>>> +    }
>>>> +
>>>>        /*
>>>> -     * TBD. Retrieve the error details and decide what action
>>>> -     * needs to be taken. One of the actions could be to pass
>>>> -     * the error to the guest and have the guest driver recover
>>>> -     * from the error. This requires that PCIe capabilities be
>>>> -     * exposed to the guest. For now, we just terminate the
>>>> -     * guest to contain the error.
>>>> +     * If the aer capability is not exposed to the guest. we just
>>>> +     * terminate the guest to contain the error.
>>>>         */
>>>>    
>>>>        error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected.  "
>>>
>>> .
>>>
>
>
> .
>
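
For reference, the severity and source ID that the quoted handler puts into
the AER message can be tried outside QEMU. This is only a minimal sketch: the
two bit values are assumed to match PCI_ERR_ROOT_CMD_NONFATAL_EN and
PCI_ERR_ROOT_CMD_FATAL_EN in Linux's pci_regs.h, and the constant and function
names are made up for illustration, not QEMU APIs.

```c
#include <stdint.h>

/* Assumed to mirror PCI_ERR_ROOT_CMD_NONFATAL_EN / PCI_ERR_ROOT_CMD_FATAL_EN
 * from Linux's pci_regs.h; the local names are hypothetical. */
#define SKETCH_ROOT_CMD_NONFATAL_EN 0x00000002u
#define SKETCH_ROOT_CMD_FATAL_EN    0x00000004u

/* Source ID as built in the patch: bus number in the high byte,
 * devfn (slot << 3 | function) in the low byte. */
uint16_t sketch_source_id(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/* An error counts as fatal when any bit set in the uncorrectable status
 * register is also set in the uncorrectable severity register. */
uint32_t sketch_severity(uint32_t uncor_status, uint32_t uncor_sever)
{
    return (uncor_status & uncor_sever) ? SKETCH_ROOT_CMD_FATAL_EN
                                        : SKETCH_ROOT_CMD_NONFATAL_EN;
}
```

For example, a device at bus 1, slot 2, function 0 has devfn 0x10, so
sketch_source_id(1, 0x10) gives 0x0110. A status bit that is not marked in the
severity register is reported as non-fatal.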
