From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50443) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YabmD-0001In-Vu for qemu-devel@nongnu.org; Tue, 24 Mar 2015 23:14:43 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Yabm9-0007Hh-Tn for qemu-devel@nongnu.org; Tue, 24 Mar 2015 23:14:41 -0400
Received: from [59.151.112.132] (port=1863 helo=heian.cn.fujitsu.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yabm9-0007Gy-7K for qemu-devel@nongnu.org; Tue, 24 Mar 2015 23:14:37 -0400
Message-ID: <55122678.3010808@cn.fujitsu.com>
Date: Wed, 25 Mar 2015 11:07:36 +0800
From: Chen Fan
MIME-Version: 1.0
References: <3c81eaae84d6b1fa6e229e765a534fdf180e1ce4.1426155432.git.chen.fan.fnst@cn.fujitsu.com> <1426286084.3643.144.camel@redhat.com> <55064870.6040209@cn.fujitsu.com> <1426477927.3643.160.camel@redhat.com> <550687B1.7020504@cn.fujitsu.com> <1426514950.3643.169.camel@redhat.com> <55121525.2040408@cn.fujitsu.com> <1427251289.3643.829.camel@redhat.com>
In-Reply-To: <1427251289.3643.829.camel@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v5 5/7] vfio-pci: pass the aer error to guest
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Alex Williamson
Cc: izumi.taku@jp.fujitsu.com, qemu-devel@nongnu.org

On 03/25/2015 10:41 AM, Alex Williamson wrote:
> On Wed, 2015-03-25 at 09:53 +0800, Chen Fan wrote:
>> On 03/16/2015 10:09 PM, Alex Williamson wrote:
>>> On Mon, 2015-03-16 at 15:35 +0800, Chen Fan wrote:
>>>> On 03/16/2015 11:52 AM, Alex Williamson wrote:
>>>>> On Mon, 2015-03-16 at 11:05 +0800, Chen Fan wrote:
>>>>>> On 03/14/2015 06:34 AM, Alex Williamson wrote:
>>>>>>> On Thu, 2015-03-12 at 18:23 +0800, Chen Fan wrote:
>>>>>>>> when the vfio device encounters an uncorrectable error in host,
>>>>>>>> the vfio_pci driver will signal the eventfd registered by this
>>>>>>>> vfio device, the results in the qemu eventfd handler getting
>>>>>>>> invoked.
>>>>>>>>
>>>>>>>> this patch is to pass the error to guest and have the guest driver
>>>>>>>> recover from the error.
>>>>>>> What is going to be the typical recovery mechanism for the guest? I'm
>>>>>>> concerned that the topology of the device in the guest doesn't
>>>>>>> necessarily match the topology of the device in the host, so if the
>>>>>>> guest were to attempt a bus reset to recover a device, for instance,
>>>>>>> what happens?
>>>>>> the recovery mechanism is that when guest got an aer error from a device,
>>>>>> guest will clean the corresponding status bit in device register. and for
>>>>>> need reset device, the guest aer driver would reset all devices under bus.
>>>>> Sorry, I'm still confused, how does the guest aer driver reset all
>>>>> devices under a bus? Are we talking about function-level, device
>>>>> specific reset mechanisms or secondary bus resets? If the guest is
>>>>> performing secondary bus resets, what guarantee do they have that it
>>>>> will translate to a physical secondary bus reset? vfio may only do an
>>>>> FLR when the bus is reset or it may not be able to do anything depending
>>>>> on the available function-level resets and physical and virtual topology
>>>>> of the device. Thanks,
>>>> in general, functions depends on the corresponding device driver behaviors
>>>> to do the recovery. e.g: implemented the error_detect, slot_reset callbacks.
>>>> and for link reset, it usually do secondary bus reset.
>>>>
>>>> and do we must require to the physical secondary bus reset for vfio device
>>>> as bus reset?
>>> That depends on how the guest driver attempts recovery, doesn't it?
>>> There are only a very limited number of cases where a secondary bus
>>> reset initiated by the guest will translate to a secondary bus reset of
>>> the physical device (iirc, single function device without FLR). In most
>>> cases, it will at best be translated to an FLR. VFIO really only does
>>> bus resets on VM reset because that's the only time we know that it's ok
>>> to reset multiple devices. If the guest driver is depending on a
>>> secondary bus reset to put the device into a recoverable state and we're
>>> not able to provide that, then we're actually reducing containment of
>>> the error by exposing AER to the guest and allowing it to attempt
>>> recovery. So in practice, I'm afraid we're risking the integrity of the
>>> VM by exposing AER to the guest and making it think that it can perform
>>> recovery operations that are not effective. Thanks,
>> I also have seen that if device without FLR, it seems can do hot reset
>> by ioctl VFIO_DEVICE_PCI_HOT_RESET to reset the physical slot or bus
>> in vfio_pci_reset. does it satisfy the recovery issues that you said?
> The hot reset interface can only be used when a) the user (QEMU) owns
> all of the devices on the bus and b) we know we're resetting all of the
> devices. That mostly limits its use to VM reset. I think that on a
> secondary bus reset, we don't know the scope of the reset at the QEMU
> vfio driver, so we only make use of reset methods with a function-level
> scope. That would only result in a secondary bus reset if that's the
> reset mechanism used by the host kernel's PCI code (pci_reset_function),
> which is limited to single function devices on a secondary bus, with no
> other reset mechanisms. The host reset is also only available in some
> configurations, for instance if we have a dual-port NIC where each
> function is a separate IOMMU group, then we clearly cannot do a hot
> reset unless both functions are assigned to the same VM _and_ appear to
> the guest on the same virtual bus. So even if we could know the scope
> of the reset in the QEMU vfio driver, we can only make use of it under
> very strict guest configurations. Thanks,
it seems difficult to allow guest to participate in recovery.
But I think we might be able to capture the result of vfio_pci_reset: if
the vfio device reset fails, we stop the VM.

Thanks,
Chen
>
> Alex
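
To make the fallback proposed above concrete, here is a minimal sketch in C. Every name in it (AssignedDevice, assigned_device_reset, action_after_reset, the ResetAction values) is hypothetical and exists only for illustration; none of them are QEMU or kernel identifiers, and how vfio_pci_reset would actually report a failure is left open. The sketch only models the decision: the reset returns a result, and a failed reset leads to stopping the VM instead of letting the guest continue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical action chosen once a device reset has been attempted. */
typedef enum {
    ACTION_CONTINUE_GUEST,  /* reset worked, the guest may keep running */
    ACTION_STOP_VM          /* reset failed, contain the error by stopping */
} ResetAction;

/* Hypothetical stand-in for an assigned device; not a QEMU structure. */
typedef struct {
    const char *name;
    bool reset_succeeds;    /* models the outcome a real reset would report */
} AssignedDevice;

/* Models a reset that reports its result: 0 on success, -1 on failure. */
static int assigned_device_reset(const AssignedDevice *dev)
{
    assert(dev != NULL);
    return dev->reset_succeeds ? 0 : -1;
}

/* Capture the reset result and choose to stop the VM when it failed. */
static ResetAction action_after_reset(const AssignedDevice *dev)
{
    if (assigned_device_reset(dev) < 0) {
        fprintf(stderr, "%s: device reset failed, stopping the VM\n",
                dev->name);
        return ACTION_STOP_VM;
    }
    return ACTION_CONTINUE_GUEST;
}
```

Calling action_after_reset() on a device whose reset_succeeds is false returns ACTION_STOP_VM; with true it returns ACTION_CONTINUE_GUEST. This does not address the scope problem raised in the quoted text (a function-level reset that succeeds may still be weaker than the bus reset the guest asked for), it only covers the case where the reset itself reports failure.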