qemu-devel.nongnu.org archive mirror
From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: izumi.taku@jp.fujitsu.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v5 5/7] vfio-pci: pass the aer error to guest
Date: Wed, 8 Apr 2015 16:59:50 +0800
Message-ID: <5524EE06.8010107@cn.fujitsu.com>
In-Reply-To: <1427903200.5567.227.camel@redhat.com>


On 04/01/2015 11:46 PM, Alex Williamson wrote:
> On Wed, 2015-04-01 at 12:12 +0800, Chen Fan wrote:
>> On 03/25/2015 10:41 AM, Alex Williamson wrote:
>>> On Wed, 2015-03-25 at 09:53 +0800, Chen Fan wrote:
>>>> On 03/16/2015 10:09 PM, Alex Williamson wrote:
>>>>> On Mon, 2015-03-16 at 15:35 +0800, Chen Fan wrote:
>>>>>> On 03/16/2015 11:52 AM, Alex Williamson wrote:
>>>>>>> On Mon, 2015-03-16 at 11:05 +0800, Chen Fan wrote:
>>>>>>>> On 03/14/2015 06:34 AM, Alex Williamson wrote:
>>>>>>>>> On Thu, 2015-03-12 at 18:23 +0800, Chen Fan wrote:
>>>>>>>>>> When the vfio device encounters an uncorrectable error in the host,
>>>>>>>>>> the vfio_pci driver will signal the eventfd registered by this
>>>>>>>>>> vfio device, which results in the QEMU eventfd handler getting
>>>>>>>>>> invoked.
>>>>>>>>>>
>>>>>>>>>> This patch passes the error to the guest and has the guest driver
>>>>>>>>>> recover from the error.
>>>>>>>>> What is going to be the typical recovery mechanism for the guest?  I'm
>>>>>>>>> concerned that the topology of the device in the guest doesn't
>>>>>>>>> necessarily match the topology of the device in the host, so if the
>>>>>>>>> guest were to attempt a bus reset to recover a device, for instance,
>>>>>>>>> what happens?
>>>>>>>> The recovery mechanism is that when the guest gets an AER error from a
>>>>>>>> device, it clears the corresponding status bit in the device register.
>>>>>>>> For devices that need a reset, the guest AER driver resets all devices
>>>>>>>> under the bus.
>>>>>>> Sorry, I'm still confused, how does the guest aer driver reset all
>>>>>>> devices under a bus?  Are we talking about function-level, device
>>>>>>> specific reset mechanisms or secondary bus resets?  If the guest is
>>>>>>> performing secondary bus resets, what guarantee do they have that it
>>>>>>> will translate to a physical secondary bus reset?  vfio may only do an
>>>>>>> FLR when the bus is reset or it may not be able to do anything depending
>>>>>>> on the available function-level resets and physical and virtual topology
>>>>>>> of the device.  Thanks,
>>>>>> In general, recovery depends on the behavior of the corresponding device
>>>>>> driver, e.g. whether it implements the error_detected and slot_reset
>>>>>> callbacks. For a link reset, it usually does a secondary bus reset.
>>>>>>
>>>>>> And must a guest bus reset translate to a physical secondary bus reset
>>>>>> for a vfio device?
>>>>> That depends on how the guest driver attempts recovery, doesn't it?
>>>>> There are only a very limited number of cases where a secondary bus
>>>>> reset initiated by the guest will translate to a secondary bus reset of
>>>>> the physical device (iirc, single function device without FLR).  In most
>>>>> cases, it will at best be translated to an FLR.  VFIO really only does
>>>>> bus resets on VM reset because that's the only time we know that it's ok
>>>>> to reset multiple devices.  If the guest driver is depending on a
>>>>> secondary bus reset to put the device into a recoverable state and we're
>>>>> not able to provide that, then we're actually reducing containment of
>>>>> the error by exposing AER to the guest and allowing it to attempt
>>>>> recovery.  So in practice, I'm afraid we're risking the integrity of the
>>>>> VM by exposing AER to the guest and making it think that it can perform
>>>>> recovery operations that are not effective.  Thanks,
>>>> I have also seen that for a device without FLR, it seems a hot reset can
>>>> be done via the VFIO_DEVICE_PCI_HOT_RESET ioctl to reset the physical
>>>> slot or bus in vfio_pci_reset. Does that address the recovery issues
>>>> you described?
>>> The hot reset interface can only be used when a) the user (QEMU) owns
>>> all of the devices on the bus and b) we know we're resetting all of the
>>> devices.  That mostly limits its use to VM reset.  I think that on a
>>> secondary bus reset, we don't know the scope of the reset at the QEMU
>>> vfio driver, so we only make use of reset methods with a function-level
>>> scope.  That would only result in a secondary bus reset if that's the
>>> reset mechanism used by the host kernel's PCI code (pci_reset_function),
>>> which is limited to single function devices on a secondary bus, with no
>>> other reset mechanisms.  The host reset is also only available in some
>>> configurations, for instance if we have a dual-port NIC where each
>>> function is a separate IOMMU group, then we clearly cannot do a hot
>>> reset unless both functions are assigned to the same VM _and_ appear to
>>> the guest on the same virtual bus.  So even if we could know the scope
>>> of the reset in the QEMU vfio driver, we can only make use of it under
>>> very strict guest configurations.  Thanks,
>> Hi Alex,
>>
>>      Do you have any ideas or an approach to fix or avoid this issue?
> Hi Chen,
>
> I expect there are two major components to this.  The first is that
> QEMU/vfio-pci needs to enforce that a bus reset is possible for the host
> and guest topology when guest AER handling is specified for a device.
> That means that everything affected by the bus reset needs to be exposed
> to the guest in a compatible way.  For instance, if a bus reset affects
> devices from multiple groups, the guest needs to not only own all of
> those groups, but they also need to be exposed to the guest such that
> the virtual bus layout reflects the extent of the reset for the physical
> bus.  This also implies that guest AER handling cannot be the default
> since it will impose significant configuration restrictions on device
> assignment.
>
> This seems like a difficult configuration enforcement to make, but maybe
> there are simplifying assumptions that can help.  For instance the
> devices need to be exposed as PCIe therefore we won't have multiple
> slots in use on a bus and I think we can therefore mostly ignore hotplug
> since we can only hotplug at a slot granularity.  That may also imply
> that we should simply enforce a 1:1 mapping of physical functions to
> virtual functions.  At least one function from each group affected by a
> reset must be exposed to the guest.
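
To make the first requirement concrete, here is a rough sketch of the kind
of check described above. It is only an illustration under stated
assumptions: struct host_fn and bus_reset_is_mappable() are hypothetical
names, not existing QEMU or VFIO interfaces, and the array stands in for
"every host function affected by the physical bus reset". The rule encoded
is the one quoted above: each affected group must have at least one
function exposed to the guest, and all exposed functions must sit on the
same virtual bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one host function affected by a bus reset. */
struct host_fn {
    int  iommu_group; /* host IOMMU group the function belongs to */
    bool exposed;     /* true if the function is assigned to this guest */
    int  guest_bus;   /* virtual bus number (>= 0), used only if exposed */
};

/* True if at least one function of the given group is exposed to the guest. */
static bool group_is_exposed(const struct host_fn *fns, size_t n, int group)
{
    for (size_t i = 0; i < n; i++) {
        if (fns[i].iommu_group == group && fns[i].exposed) {
            return true;
        }
    }
    return false;
}

/*
 * Decide whether a guest-initiated bus reset could be mapped onto the
 * physical bus reset that affects the functions in fns[0..n-1].
 */
static bool bus_reset_is_mappable(const struct host_fn *fns, size_t n)
{
    int bus = -1; /* virtual bus of the first exposed function seen */

    assert(fns != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Every affected group must be visible to the guest. */
        if (!group_is_exposed(fns, n, fns[i].iommu_group)) {
            return false;
        }
        if (!fns[i].exposed) {
            continue;
        }
        /* All exposed functions must share one virtual bus. */
        if (bus == -1) {
            bus = fns[i].guest_bus;
        } else if (bus != fns[i].guest_bus) {
            return false;
        }
    }
    return n > 0;
}
```

For the dual-port NIC case mentioned earlier in the thread, two functions
in separate groups pass this check only when both are exposed and share a
guest_bus value. A stricter 1:1 mapping of physical to virtual functions
would additionally reject any unexposed function.
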
>
> The second issue is that individual QEMU PCI devices have no callback
> for a bus reset.  QEMU/vfio-pci currently has the DeviceClass.reset
> callback, which we assume to be a function-level reset.  We also
> register with qemu_register_reset() for a VM reset, which is the only
> point currently that we know we can do a reset affecting multiple
> devices.  Infrastructure will need to be added to QEMU/PCI to expose the
> link down/RST signal to devices on a bus to trigger a multi-device reset
> in vfio-pci.
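
As a rough illustration of the missing infrastructure, the sketch below
models a per-device bus-reset hook next to the existing function-level
one. Everything in it is hypothetical (sketch_dev, sketch_bus, and
sketch_secondary_bus_reset() are made-up names, not QEMU types or
functions); it only shows the dispatch idea: when the bus is reset, a
device that registered a bus-level hook is told about it, and other
devices fall back to their function-level reset.

```c
#include <stddef.h>

struct sketch_dev;
typedef void (*sketch_reset_fn)(struct sketch_dev *dev);

/* Hypothetical device model with two reset hooks. */
struct sketch_dev {
    sketch_reset_fn function_reset; /* function-level scope, like today */
    sketch_reset_fn bus_reset;      /* assumed new hook, may be NULL */
    int function_resets;            /* counters, for demonstration only */
    int bus_resets;
};

/* Hypothetical bus: a plain array of the devices behind one bridge. */
struct sketch_bus {
    struct sketch_dev **devs;
    size_t ndevs;
};

/* Example hooks that only count how often they were called. */
static void count_function_reset(struct sketch_dev *dev)
{
    dev->function_resets++;
}

static void count_bus_reset(struct sketch_dev *dev)
{
    dev->bus_resets++;
}

/*
 * Propagate a secondary bus reset to every device on the bus. Devices
 * with a bus-level hook learn that the whole bus was reset; the rest
 * only get their function-level reset.
 */
static void sketch_secondary_bus_reset(struct sketch_bus *bus)
{
    for (size_t i = 0; i < bus->ndevs; i++) {
        struct sketch_dev *dev = bus->devs[i];

        if (dev->bus_reset) {
            dev->bus_reset(dev);
        } else if (dev->function_reset) {
            dev->function_reset(dev);
        }
    }
}
```

In this model a vfio-pci-like device would set bus_reset and could use
the notification to trigger a multi-device reset, while purely emulated
devices would leave it NULL and keep their current behavior.
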
>
> Hopefully I'm not missing something, but I think both of those changes
> are going to be required before we can have anything remotely
> supportable for guest-based AER error handling.  This is pretty complicated
> for the user and also for libvirt to figure out.  At a minimum libvirt
> would need to support a new guest-based AER handling flag for devices.
> We probably need to determine whether this is unique to vfio-pci or a
> generic PCIDevice option.  Thanks,

Hi Alex,
   Solving these two issues seems like a large amount of work. Is there a
   simpler way to support AER in QEMU?

Thanks,
Chen
