Date: Fri, 27 Jun 2014 13:51:42 -0400 (EDT)
From: Paolo Bonzini
Message-ID: <1081899261.33608246.1403891502613.JavaMail.zimbra@redhat.com>
In-Reply-To: <1403886899.31091.187.camel@ul30vt.home>
References: <1403879569-24256-1-git-send-email-pbonzini@redhat.com> <1403886899.31091.187.camel@ul30vt.home>
Subject: Re: [Qemu-devel] [PATCH for 2.1] vfio: use correct runstate
To: Alex Williamson
Cc: qemu-trivial@nongnu.org, qemu-devel@nongnu.org

----- Original Message -----
> From: "Alex Williamson"
> To: "Paolo Bonzini"
> Cc: qemu-devel@nongnu.org, qemu-trivial@nongnu.org
> Sent: Friday, 27 June 2014 18:34:59
> Subject: Re: [PATCH for 2.1] vfio: use correct runstate
>
> On Fri, 2014-06-27 at 16:32 +0200, Paolo Bonzini wrote:
> > io-error is for block device errors; it should always be preceded
> > by a BLOCK_IO_ERROR event.
>
> Where does this requirement come from?  I only see a loose association
> of IO_ERROR to disk in libvirt and none in QEMU.

See the RunState enum in qapi-schema.json:

##
# @RunState
#
# An enumeration of VM run states.
#
# ...
#
# @internal-error: An internal error that prevents further guest execution
#                  has occurred
#
# @io-error: the last IOP has failed and the device is configured to pause
#            on I/O errors
#
# @paused: guest has been paused via the 'stop' command

The point of io-error is that management can look at block devices, see if
any have an error reported, and then resume execution (see documentation of
rerror=stop and werror=stop/enospc).  This is counter to the intentions
you have in vfio.

> > I think vfio wants to use
> > RUN_STATE_INTERNAL_ERROR instead.
>
> But that seems to put us into an "unknown" paused state in libvirt.

I think paused is incorrect, because (unlike RUN_STATE_IO_ERROR), you cannot
resume from RUN_STATE_INTERNAL_ERROR except with a reset.  QEMU enforces that,
and this matches the error you are reporting:

    error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected. "
                 "Please collect any data possible and then kill the guest",
                 __func__, vdev->host.domain, vdev->host.bus,
                 vdev->host.slot, vdev->host.function);

libvirt has a crashed state, I think that's what libvirt should call
the internal-error runstate.  IIRC on Xen you get to crashed when the
processor raises an error on vmentry, for example.

Libvirt only knows about crashed/unknown, but one could add
crashed/internal-error too.

Paolo
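
The difference between the two run states discussed above (io-error can be resumed directly, internal-error only after a reset) can be sketched as a small toy model. This is not QEMU source: the `Toy*` type and the `toy_*` functions are hypothetical names made up for illustration, and the choice to land in the paused state after a reset is an assumption of this sketch.

```c
#include <stdbool.h>

/* Toy model only. The states mirror the run states named in the message,
 * but every identifier here is hypothetical and not part of QEMU. */
typedef enum {
    TOY_RUN_STATE_RUNNING,
    TOY_RUN_STATE_PAUSED,
    TOY_RUN_STATE_IO_ERROR,
    TOY_RUN_STATE_INTERNAL_ERROR,
} ToyRunState;

/* True when the guest may only continue after a reset. */
bool toy_needs_reset(ToyRunState s)
{
    return s == TOY_RUN_STATE_INTERNAL_ERROR;
}

/* Model of a resume request. It succeeds from paused or io-error and is
 * refused while a reset is still required. */
bool toy_cont(ToyRunState *s)
{
    if (toy_needs_reset(*s)) {
        return false;               /* state is left unchanged */
    }
    *s = TOY_RUN_STATE_RUNNING;
    return true;
}

/* Model of a reset. Assumption of this sketch: a guest that needed a reset
 * ends up paused afterwards; any other state is left as it was. */
void toy_reset(ToyRunState *s)
{
    if (toy_needs_reset(*s)) {
        *s = TOY_RUN_STATE_PAUSED;
    }
}
```

In this model a management layer that finds the guest in the io-error state can call `toy_cont()` once the block device problem is handled, whereas from the internal-error state the same call is refused until `toy_reset()` has run.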