From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55638) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bNAJM-00057L-Pz for qemu-devel@nongnu.org; Tue, 12 Jul 2016 22:54:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bNAJJ-00071I-Jh for qemu-devel@nongnu.org; Tue, 12 Jul 2016 22:54:08 -0400 Received: from mx1.redhat.com ([209.132.183.28]:34211) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bNAJJ-0006zP-B5 for qemu-devel@nongnu.org; Tue, 12 Jul 2016 22:54:05 -0400 Date: Tue, 12 Jul 2016 20:54:02 -0600 From: Alex Williamson Message-ID: <20160712205402.3cff454f@t450s.home> In-Reply-To: <6ed2c694-bb43-d18f-dd7d-c749eddeed5d@cn.fujitsu.com> References: <1464315131-25834-1-git-send-email-zhoujie2011@cn.fujitsu.com> <20160627215808.1531a774@t450s.home> <7912dad0-0e37-603d-fdfe-bb4950b55f28@cn.fujitsu.com> <20160628084052.1e85a730@t450s.home> <689ac38f-96d7-9717-e9c4-d9926272cb86@cn.fujitsu.com> <20160629122242.2ac20254@t450s.home> <99e88b46-8b8b-f7de-700f-8d644a7f005a@cn.fujitsu.com> <20160705110300.029985a6@t450s.home> <24876827-07fc-6f69-6949-7e358720c914@cn.fujitsu.com> <20160707130442.7ac934c8@t450s.home> <3e661554-3c79-9d8f-41b2-c4cdd0fcb95f@cn.fujitsu.com> <20160708113358.2e5ed040@t450s.home> <20160711102456.0baa5e26@t450s.home> <20160712094543.422ce7f7@t450s.home> <6ed2c694-bb43-d18f-dd7d-c749eddeed5d@cn.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Zhou Jie Cc: izumi.taku@jp.fujitsu.com, caoj.fnst@cn.fujitsu.com, Chen Fan , qemu-devel@nongnu.org, mst@redhat.com On Wed, 13 Jul 2016 09:04:16 +0800 Zhou Jie wrote: > Hi Alex, > > >> I will use workable state support flag > >> to let user know whether the kenerl support block feature. > >> And make configure space writing and ioctl function blocked. > > > > And what of my suggestion that a user may desire to poll the state of > > the device? > I will also add a poll function to vfio_fops. Can you explain how this will work? I was only suggesting that one of the flag bits in vfio_device_info be allocated to report the current state of blocking and the user could poll by repeatedly calling the DEVICE_INFO ioctl. Are you thinking of using POLLOUT/POLLIN? I'm not sure if those are a perfect match since it's really only the PCI config region and a few ioctls where access is blocked, other operations may proceed normally. > > A user does know what the vfio driver has done if you define the > > behavior that on an AER error reported event, as signaled to the user > > via the error notification interrupt, vfio-pci will teardown device > > interrupts to an uninitialized state. The difference between the > > command register approach you suggest and the teardown I suggest is > > that the command register is simply masking interrupt deliver while the > > teardown approach returns the device to an uninitialized interrupt > > state. Take a look at the device state when a bus reset occurs, what > > state is saved and restored and what is left at a default PCI value. > > The command register is saved and restored, so any manipulation we do > > of it is racing the host kernel AER handling and bus reset. What about > > MSI and MSI-X? Looks to me like those are left at the PCI default > > initialization state, so now after an AER error we have irq handlers > > and eventfds configured, while in fact the device has been > > de-programmed. To handle that we're expecting users to teardown the > > interrupt state and re-establish it? Again, why not just teardown the > > interrupt state ourselves? I dont' see the value in simply masking the > > command register, especially when it doesn't handle the no-DisINTx case. > I understand. > Thank you very much to explain this to me. > I will teardown the interrupt state. > > > We cannot depend on the behavior of any given driver and the fact that > > the guest driver may teardown interrupts anyway is not a justification > > that vfio shouldn't be doing this to make the device state presented to > > the user consistent. Thanks, > I understand. > > The following code will be modified. > 1. vfio_pci_ioctl > add a flag in vfio_device_info for workable_state support > 2. vfio_pci_ioctl > During err occurs and resume: > if (cmd == VFIO_DEVICE_SET_IRQS || VFIO_DEVICE_RESET > || VFIO_DEVICE_GET_PCI_HOT_RESET_INFO || VFIO_DEVICE_PCI_HOT_RESET) > block for workable_state clearing > 3. vfio_pci_write > During err occurs and resume: > block write to configure space > 4. vfio_pci_aer_err_detected > Set workable_state to false in "struct vfio_pci_device" > teardown the interrupt > 5. vfio_pci_aer_resume > Set workable_state to true in "struct vfio_pci_device" > 6. vfio_fops > Add poll function I would still suggest that the name "workable_state" is quite vague. Something like aer_error_in_progress is much more specific. Thanks, Alex