Date: Wed, 25 May 2016 11:45:11 +0300
From: "Michael S. Tsirkin"
Message-ID: <20160525113623-mutt-send-email-mst@redhat.com>
References: <1459856523-17085-1-git-send-email-caoj.fnst@cn.fujitsu.com>
 <1459856523-17085-12-git-send-email-caoj.fnst@cn.fujitsu.com>
 <20160411153827.3884ded1@t450s.home>
 <570EEC42.3040300@cn.fujitsu.com>
 <571EE2D6.4000100@cn.fujitsu.com>
 <20160426084815.24ec5200@t450s.home>
 <20160524134742-mutt-send-email-mst@redhat.com>
 <20160524205406.6dabaf71@ul30vt.home>
In-Reply-To: <20160524205406.6dabaf71@ul30vt.home>
Subject: Re: [Qemu-devel] [patch v6 11/12] vfio: register aer resume notification handler for aer resume
To: Alex Williamson
Cc: Chen Fan, Cao jin, izumi.taku@jp.fujitsu.com, qemu-devel@nongnu.org

On Tue, May 24, 2016 at 08:54:06PM -0600, Alex Williamson wrote:
> On Tue, 24 May 2016 13:49:12 +0300
> "Michael S. Tsirkin" wrote:
>
> > On Tue, Apr 26, 2016 at 08:48:15AM -0600, Alex Williamson wrote:
> > > I think that means that if we want to switch from a
> > > simple halt-on-error to a mechanism for the guest to handle recovery,
> > > we need to disable access to the device between being notified that the
> > > error occurred and being notified to resume.
> >
> > But this isn't what happens on bare metal.
> > Errors are reported asynchronously and host might access the device
> > meanwhile. These accesses might or might not trigger more errors, but
> > fundamentally this should not matter too much as device is going to be
> > reset.
>
> Bare metal also doesn't have a hypervisor underneath performing a PCI
> bus reset,

This is where I get lost. I assumed we do reset when guest requests it.
Isn't that the case? Why not?

> there's only one OS trying to control the device at a time,
> so we have some clear differences from bare metal that I don't know we
> can avoid. The thought here was that we need to notify the guest at the
> earliest point we can, but let the host recovery run to completion
> before allowing the user to interact with the device. Perhaps there is
> no need to block region access to the device (ie. config space & BAR
> resources), but I think we do need to somehow synchronize the bus resets
> or else we get situations like that observed previously where the bus is
> still in reset while userspace tries to proceed with using it.
>

Why do we have to trigger reset upon an error? Why not wait for guest
to request reset?

> The next question then would be whether that's QEMU's job or something
> that should be done in the host kernel. It's been proposed to add yet
> another eventfd for the kernel vfio-pci to signal QEMU when a resume
> notification has occurred, but perhaps the better approach would be for
> the hot reset ioctl (and base reset ioctl) to handle this situation more
> transparently. We could immediately return -EAGAIN and allow QEMU to
> delay itself for any reset ioctl received after the AER error detected
> event, but before the resume event. We could also allow some sort of
> timeout, that the ioctl might enter an interruptible sleep, woken on
> the resume notification or timeout.
> That sounds a bit better to me as
> the specification of what's allowed between the error detected
> notification and the resume notification is otherwise pretty poorly
> defined.

So if guest started reset, it might take a while for device to come out
of that state, and access during this time might trigger errors. But
that's already possible for guest to trigger, right? How is this
different?

> Do you think we can run completely asynchronous, letting the
> host and guest bus resets race? Thanks,
>
> Alex

I have a feeling we need to put some code out, disabled by default,
and see how it behaves in the field. For example ability to trigger UR
errors seems benign but I think we are trying to prevent them now
because of something we saw in the field.

-- 
MST