From: Luiz Capitulino <lcapitulino@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: qemu-devel@nongnu.org, armbru@redhat.com
Subject: Re: [Qemu-devel] Re: Two QMP events issues
Date: Mon, 8 Feb 2010 17:59:43 -0200
Message-ID: <20100208175943.3ed2bf88@doriath>
In-Reply-To: <4B706290.7020104@codemonkey.ws>
On Mon, 08 Feb 2010 13:14:24 -0600
Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 02/08/2010 12:25 PM, Luiz Capitulino wrote:
> > On Mon, 08 Feb 2010 09:13:37 -0600
> > Anthony Liguori<anthony@codemonkey.ws> wrote:
> >
> >
> >> On 02/08/2010 08:56 AM, Daniel P. Berrange wrote:
> >>
> >>> On Mon, Feb 08, 2010 at 08:49:20AM -0600, Anthony Liguori wrote:
> >>>
> >>>
> >>>> On 02/08/2010 08:12 AM, Daniel P. Berrange wrote:
> >>>>
> >>>>
> >>>>> For further background, the key end goal here is that in a QMP client, upon
> >>>>> receipt of the 'RESET' event, we need to reliably & immediately determine
> >>>>> why it occurred. eg, triggered by watchdog, or by guest OS request. There
> >>>>> are actually 3 possible sequences
> >>>>>
> >>>>> - WATCHDOG + action=reset, followed by RESET. Assuming no intervening
> >>>>> event can occur, the client can merely record 'WATCHDOG' and interpret
> >>>>> it when it gets the immediately following 'RESET' event
> >>>>>
> >>>>> - RESET, followed by WATCHDOG + action=reset. The client doesn't know
> >>>>> the reason for the RESET and can't wait arbitrarily for WATCHDOG since
> >>>>> there might never be one arriving.
> >>>>>
> >>>>> - RESET + source=watchdog. Client directly sees the reason
> >>>>>
> >>>>> The second scenario is the one I'd like us to avoid at all costs, since it
> >>>>> will require the client to introduce arbitrary delays in processing events
> >>>>> to determine cause. The first is slightly inconvenient, but doable if we
> >>>>> can assume no intervening events will occur, between WATCHDOG and the
> >>>>> RESET events. The last is obviously simplest for the clients.
> >>>>>
> >>>>>
> >>>>>
> >>>> I really prefer the third option but I'm a little concerned that we're
> >>>> throwing events around somewhat haphazardly.
> >>>>
> >>>> So let me ask, why does a client need to determine when a guest reset
> >>>> and why it reset?
> >>>>
> >>>>
> >>> If a guest OS is repeatedly hanging/crashing resulting in the watchdog
> >>> device firing, management software for the host really wants to know about
> >>> that (so that appropriate alerts/action can be taken) and thus needs to
> >>> be able to distinguish this from a "normal" guest OS initiated reboot.
> >>>
> >>>
> >> I think that's an argument for having the watchdog events independent of
> >> the reset events.
> >>
> >> The watchdog condition happening is not directly related to the action
> >> the watchdog takes. The watchdog event really belongs in a class of
> >> events that are closely associated with a particular device emulation.
> >>
> >> In fact, I think what we're really missing in events today is a notion
> >> of a context. A RESET event is really a CPU event. A watchdog
> >> expiration event is a watchdog event. A connect event is a VNC event
> >> (Spice and chardevs will also generate connect events).
> >>
> > This could be done by adding a 'context' member to all the events and
> > then an event would have to be identified by the pair event_name:context.
> >
> > This way we can have the same event_name for events in different
> > contexts. For example:
> >
> > { 'event': 'DISCONNECT', 'context': 'spice', [...] }
> >
> > { 'event': 'DISCONNECT', 'context': 'vnc', [...] }
> >
> > Note that today we have VNC_DISCONNECT and will probably have
> > SPICE_DISCONNECT too.
> >
>
> Which is why we gave ourselves until 0.13 to straighten out the protocol.
Yeah.
> N.B. in this model, you'd have:
>
> { 'event' : 'EXPIRED', 'context': 'watchdog', 'action': 'reset' }
> /* some arbitrary number of events */
> { 'event' : 'RESET', 'context': 'cpu' }
>
> And the only reason RESET follows EXPIRED is because action=reset. If
> action was different, a RESET might not occur.
>
> A client needs to see the EXPIRED event, determine whether to expect a
> RESET event, and if so, wait for the next RESET event to happen.
Looks reasonable to me. What do you think, Daniel?
Note that if we agree on the 'context' design, I'll have to change
the VNC event names.
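
To make the client-side logic quoted above concrete, here is a rough
sketch in Python. It is only an illustration under assumptions: the
'context' member is still a proposal, the ResetTracker class and its
handle() method are hypothetical names, and the "other" label simply
stands for any RESET that was not announced by a watchdog expiration.

```python
from typing import Optional


class ResetTracker:
    """Attributes each RESET event to a cause (hypothetical helper)."""

    def __init__(self) -> None:
        # True once a watchdog EXPIRED event with action=reset has been
        # seen and the matching RESET has not arrived yet.
        self.watchdog_reset_pending = False

    def handle(self, event: dict) -> Optional[str]:
        """Return the cause of a RESET event, or None for other events."""
        # Events are identified by the pair (event name, context).
        key = (event.get("event"), event.get("context"))
        if key == ("EXPIRED", "watchdog"):
            # Only action=reset implies that a RESET will follow.
            if event.get("action") == "reset":
                self.watchdog_reset_pending = True
            return None
        if key == ("RESET", "cpu"):
            cause = "watchdog" if self.watchdog_reset_pending else "other"
            self.watchdog_reset_pending = False
            return cause
        # Unrelated events may arrive in between; they leave the flag alone.
        return None


tracker = ResetTracker()
stream = [
    {"event": "EXPIRED", "context": "watchdog", "action": "reset"},
    {"event": "DISCONNECT", "context": "vnc"},  # intervening event
    {"event": "RESET", "context": "cpu"},
    {"event": "RESET", "context": "cpu"},  # no preceding EXPIRED
]
causes = [tracker.handle(e) for e in stream]
```

The point of the sketch is that the client never has to delay
processing: it records the EXPIRED event when it arrives and consults
that record only when the next RESET shows up, however many unrelated
events come in between.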