Date: Thu, 2 Jun 2011 15:57:37 -0300
From: Luiz Capitulino
To: Anthony Liguori
Cc: Kevin Wolf, Stefan Hajnoczi, Jiri Denemark, qemu-devel@nongnu.org, Markus Armbruster
Subject: Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason
Message-ID: <20110602155737.13f48a46@doriath>
In-Reply-To: <4DE7D739.6080607@codemonkey.ws>
References: <20110601181255.077fb5fd@doriath> <4DE6B087.6010708@codemonkey.ws> <20110602090632.GB14571@redhat.com> <4DE78B53.1010201@codemonkey.ws> <20110602132405.GJ514380@orkuz.home> <4DE797F6.2060004@codemonkey.ws> <20110602150124.0b3c187f@doriath> <4DE7D739.6080607@codemonkey.ws>

On Thu, 02 Jun 2011 13:32:25 -0500
Anthony Liguori wrote:

> On 06/02/2011 01:01 PM, Luiz Capitulino wrote:
> > On Thu, 02 Jun 2011 09:02:30 -0500
> > Anthony Liguori wrote:
> >
> >> On 06/02/2011 08:24 AM, Jiri Denemark wrote:
> >>> On Thu, Jun 02, 2011 at 08:08:35 -0500, Anthony Liguori wrote:
> >>>> On 06/02/2011 04:06 AM, Daniel P. Berrange wrote:
> >>>>>>> B. query-stop-reason
> >>>>>>> --------------------
> >>>>>>>
> >>>>>>> I also have a simple solution for item 2. The vm_stop() accepts a reason
> >>>>>>> argument, so we could store it somewhere and return it as a string, like:
> >>>>>>>
> >>>>>>> -> { "execute": "query-stop-reason" }
> >>>>>>> <- { "return": { "reason": "user" } }
> >>>>>>>
> >>>>>>> Valid reasons could be: "user", "debug", "shutdown", "diskfull" (hey,
> >>>>>>> this should be "ioerror", no?), "watchdog", "panic", "savevm", "loadvm",
> >>>>>>> "migrate".
> >>>>>>>
> >>>>>>> Also note that we have a STOP event. It should be extended with the
> >>>>>>> stop reason too, for completeness.
> >>>>>>
> >>>>>>
> >>>>>> Can we just extend query-block?
> >>>>>
> >>>>> Primarily we want 'query-stop-reason' to tell us what caused the VM
> >>>>> CPUs to stop. If that reason was 'ioerror', then 'query-block' could
> >>>>> be used to find out which particular block device(s) caused the IO
> >>>>> error to occurr& get the "reason" that was in the BLOCK_IO_ERROR
> >>>>> event.
> >>>>
> >>>> My concern is that we're over abstracting here. We're not going to add
> >>>> additional stop reasons in the future.
> >>>>
> >>>> Maybe just add an 'io-error': True to query-state.
> >>>
> >>> Sure, adding a new field to query-state response would work as well. And it
> >>> seems like a good idea to me since one already needs to call query-status to
> >>> check if CPUs are stopped or not so it makes sense to incorporate the
> >>> additional information there as well. And if you want to be safe for the
> >>> future, the new field doesn't have to be boolean 'io-error' but it can be the
> >>> string 'reason' which Luiz suggested above.
> >>
> >>
> >> String enumerations are a Bad Thing. It's impossible to figure out what
> >> strings are valid and it lacks type safety.
> >>
> >> Adding more booleans provides better type safety, and when we move to
> >> QAPI with a queryable schema, provides a way to figure out exactly what
> >> combinations are supported by QEMU.
> >
> > To summarize:
> >
> > 1. Add a 'io-error' field to query-status (which is only present if
> > field 'running' is false)
>
> It may or may not be present. Lack of presence does not tell you anything.
>
> It is only true when running is false AND the guest was stopped because
> of an io error.

Right.

> >
> > 2. Extend query-block to contain error information associated with the
> > device. This is interesting, because this information will be available
> > even if the error didn't cause the VM to stop
>
> Well we need at least some way to indicate that a block device is in a
> failed state. For instance, if you have two block device, but you miss
> the IO_ERROR event, you need to figure out which of the two devices is
> giving errors.

Can't query-block be used for that? The 'io-error' key will only be present
for the failing device(s).

>
> But I was thinking of something that had the semantics of, last_iop_failed.
>
> Regards,
>
> Anthony Liguori
>
> > Seems good enough to me, comments?
> >
>
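
For illustration only, here is a minimal Python sketch of how a management client might apply the semantics agreed above. It assumes the proposed (not yet implemented) 'io-error' key in both the query-status and query-block replies; the helper names and the sample replies are hypothetical, and only the 'running' and 'device' keys are taken from existing QMP output.

```python
def stopped_by_io_error(status):
    """Return True only when the guest is stopped AND 'io-error' is true.

    A missing 'io-error' key is treated as "unknown", not as "no error",
    since lack of presence does not tell the client anything.
    """
    return status.get("running") is False and status.get("io-error") is True


def failing_devices(block_info):
    """Return the names of block devices that carry the proposed 'io-error' key."""
    return [dev["device"] for dev in block_info if "io-error" in dev]


# Hypothetical replies, shaped like the "return" member of a QMP response.
status_reply = {"running": False, "io-error": True}
block_reply = [
    {"device": "ide0-hd0"},
    {"device": "virtio0", "io-error": True},
]

if stopped_by_io_error(status_reply):
    print("guest stopped on I/O error; failing devices:",
          failing_devices(block_reply))
```

A client that missed the BLOCK_IO_ERROR event could call the second helper on the query-block reply to find out which device is failing, even when the error did not stop the VM.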