Date: Mon, 8 Feb 2010 14:56:53 +0000
From: "Daniel P. Berrange"
To: Anthony Liguori
Cc: armbru@redhat.com, qemu-devel@nongnu.org, Luiz Capitulino
Subject: Re: [Qemu-devel] Re: Two QMP events issues
Message-ID: <20100208145653.GA25256@redhat.com>
In-Reply-To: <4B702470.5080401@codemonkey.ws>
References: <20100208114145.4bd64349@doriath> <20100208141218.GG17328@redhat.com> <4B702470.5080401@codemonkey.ws>
List-Id: qemu-devel.nongnu.org

On Mon, Feb 08, 2010 at 08:49:20AM -0600, Anthony Liguori wrote:
> On 02/08/2010 08:12 AM, Daniel P. Berrange wrote:
> >
> >For further background, the key end goal here is that in a QMP client, upon
> >receipt of the 'RESET' event, we need to reliably & immediately determine
> >why it occurred, e.g. triggered by watchdog, or by guest OS request. There
> >are actually 3 possible sequences:
> >
> >  - WATCHDOG + action=reset, followed by RESET. Assuming no intervening
> >    event can occur, the client can merely record 'WATCHDOG' and interpret
> >    it when it gets the immediately following 'RESET' event.
> >
> >  - RESET, followed by WATCHDOG + action=reset. The client doesn't know
> >    the reason for the RESET and can't wait arbitrarily for WATCHDOG since
> >    there might never be one arriving.
> >
> >  - RESET + source=watchdog. Client directly sees the reason.
> >
> >The second scenario is the one I'd like us to avoid at all costs, since it
> >will require the client to introduce arbitrary delays in processing events
> >to determine cause. The first is slightly inconvenient, but doable if we
> >can assume no intervening events will occur between the WATCHDOG and
> >RESET events. The last is obviously simplest for the clients.
> >
>
> I really prefer the third option but I'm a little concerned that we're
> throwing events around somewhat haphazardly.
>
> So let me ask, why does a client need to determine when a guest reset
> and why it reset?

If a guest OS is repeatedly hanging/crashing, resulting in the watchdog
device firing, management software for the host really wants to know about
that (so that appropriate alerts/action can be taken) and thus needs to
be able to distinguish this from a "normal" guest OS initiated reboot.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
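
For illustration only, here is a rough sketch of how a client might cope
with the first and third sequences quoted above. Everything in it is an
assumption rather than anything QEMU or libvirt provides: the class name
ResetCauseTracker is made up, events are taken to be already-decoded JSON
objects with "event" and "data" keys, the "source" field on RESET is the
hypothetical one from the third sequence, and it relies on no other event
arriving between WATCHDOG and RESET.

```python
from typing import Optional


class ResetCauseTracker:
    """Hypothetical helper: pairs a WATCHDOG event with the RESET right after it."""

    def __init__(self) -> None:
        # True only while the most recent event was WATCHDOG with action=reset.
        self._watchdog_reset_pending = False

    def handle(self, event: dict) -> Optional[str]:
        """Return the reset cause for a RESET event, or None for anything else."""
        name = event.get("event")
        data = event.get("data") or {}

        if name == "WATCHDOG":
            # Sequence 1, first half: remember it and wait for the RESET.
            self._watchdog_reset_pending = data.get("action") == "reset"
            return None

        if name == "RESET":
            # Sequence 3: the (assumed) "source" field states the cause directly.
            source = data.get("source")
            if source is None:
                # Sequence 1, second half: fall back to the recorded WATCHDOG.
                source = "watchdog" if self._watchdog_reset_pending else "other"
            self._watchdog_reset_pending = False
            return source

        # Any other event breaks the "immediately following" assumption,
        # so the recorded WATCHDOG is discarded.
        self._watchdog_reset_pending = False
        return None
```

The second sequence (RESET arriving before WATCHDOG) is deliberately not
handled: the tracker would report "other" for that RESET, which is exactly
the ambiguity that can only be papered over with arbitrary delays.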