From: Paolo Bonzini <pbonzini@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: pkrempa@redhat.com, marcel.a@redhat.com, libvir-list@redhat.com,
	hutao@cn.fujitsu.com, qemu-stable@nongnu.org,
	qemu-devel@nongnu.org, lcapitulino@redhat.com, rhod@redhat.com,
	kraxel@redhat.com, anthony@codemonkey.ws, lersek@redhat.com,
	afaerber@suse.de
Subject: Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
Date: Wed, 21 Aug 2013 15:46:08 +0200
Message-ID: <5214C4A0.6040303@redhat.com>
In-Reply-To: <20130821133029.GC7872@redhat.com>

On 21/08/2013 15:30, Michael S. Tsirkin wrote:
> On Wed, Aug 21, 2013 at 02:01:17PM +0200, Paolo Bonzini wrote:
>> After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
>> The reason for this is that events are edge-triggered, and can be lost if
>> management dies at the wrong time.  Stopping a panicked VM lets management
>> know of a panic even if it has crashed; management can learn about the
>> panic when it restarts and queries running QEMU processes.  The downside
>> is of course that the VM will be paused while management is not running,
>> but that is acceptable if it only happens with explicit "-device pvpanic".
>>
>> Upon learning of a panic, management (if configured to do so) can pick a
>> variety of behaviors: leave the VM paused, reset it, destroy it.  In
>> addition to all of these behaviors, it is possible to dump the VM core
>> from the host.
>>
>> However, right now, the panicked state is irreversible, and can only be
>> exited by resetting the machine.  This means that any policy decision
>> is entirely in the hands of the host.  In particular there is no way to
>> use the "reboot on panic" option together with pvpanic.
>>
>> This patch makes the panicked state reversible (and removes various
>> workarounds that were there because of the state being irreversible).
>> With this change, management has a wider set of possible policies: it
>> can just log the crash and leave policy to the guest, or it can leave the
>> VM paused.  In particular, the "log the crash and continue" policy is
>> implemented simply by sending a "cont" as soon as management learns about
>> the panic.
>> Management could also implement the "irreversible paused state" itself.
>> And again, all such actions can be coupled with dumping the VM core.
>>
>> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
>> it uses "-device pvpanic", management should check for "cont" failures.
>> If "cont" fails, management can then log that the VM remained paused
>> and urge the administrator to update QEMU.
>>
>> I suggest that this patch be included in a 1.6.1 release as soon as
>> possible, and perhaps in the 1.5 branch too.
>>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> OK this does sound reasonable, but it looks like current behaviour
> was intentional

Yes, it was intentional and it also sounded reasonable at the time.  The
gdbstub.c hack definitely should have raised a warning sign, though.

I suspect it was done this way simply because Xen has a "crashed" state
that behaves exactly like that, with policy determined exclusively by
the host.  And it has the same problems, in fact, even though I never
heard anyone complain about it for some weird reason...

With this patch, the same thing can be implemented at the libvirt level,
and the GUEST_PANICKED runstate now matches the semantics of watchdogs
too, so this solution is not only easier to use, but also more
consistent with the rest of QEMU.

Paolo

> , so I wonder why it was put in place.
> Any idea?
> 
>> ---
>>  gdbstub.c | 3 ---
>>  vl.c      | 6 ++----
>>  2 files changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/gdbstub.c b/gdbstub.c
>> index 35ca7c2..747e67d 100644
>> --- a/gdbstub.c
>> +++ b/gdbstub.c
>> @@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s)
>>  #ifdef CONFIG_USER_ONLY
>>      s->running_state = 1;
>>  #else
>> -    if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
>> -        runstate_set(RUN_STATE_DEBUG);
>> -    }
>>      if (!runstate_needs_reset()) {
>>          vm_start();
>>      }
>> diff --git a/vl.c b/vl.c
>> index 25b8f2f..818d99e 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>>  
>> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
>> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
>>      { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
>> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },
>>  
>>      { RUN_STATE_MAX, RUN_STATE_MAX },
>>  };
>> @@ -685,8 +684,7 @@ int runstate_is_running(void)
>>  bool runstate_needs_reset(void)
>>  {
>>      return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
>> -        runstate_check(RUN_STATE_SHUTDOWN) ||
>> -        runstate_check(RUN_STATE_GUEST_PANICKED);
>> +        runstate_check(RUN_STATE_SHUTDOWN);
>>  }
>>  
>>  StatusInfo *qmp_query_status(Error **errp)
>> -- 
>> 1.8.3.1
> 
> 

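For readers following the diff, the effect of the two vl.c hunks can be
sketched as a small standalone model.  This is only an illustration under
stated assumptions, not QEMU's code: the State enum, the before_patch and
after_patch tables, and the helpers transition_allowed(), needs_reset() and
model_cont() are hypothetical names invented for this sketch.  In
particular, the model returns false on a disallowed transition, whereas the
real runstate code treats an invalid transition as a fatal error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the run states touched by the patch. */
typedef enum {
    STATE_RUNNING,
    STATE_PAUSED,
    STATE_DEBUG,
    STATE_FINISH_MIGRATE,
    STATE_INTERNAL_ERROR,
    STATE_SHUTDOWN,
    STATE_GUEST_PANICKED,
    STATE_MAX               /* sentinel, mirrors the table terminator */
} State;

typedef struct {
    State from;
    State to;
} Transition;

/* Transitions out of the panicked state as listed before the patch. */
static const Transition before_patch[] = {
    { STATE_GUEST_PANICKED, STATE_PAUSED },
    { STATE_GUEST_PANICKED, STATE_FINISH_MIGRATE },
    { STATE_GUEST_PANICKED, STATE_DEBUG },
    { STATE_MAX, STATE_MAX },
};

/* Transitions out of the panicked state as listed after the patch. */
static const Transition after_patch[] = {
    { STATE_GUEST_PANICKED, STATE_RUNNING },
    { STATE_GUEST_PANICKED, STATE_FINISH_MIGRATE },
    { STATE_MAX, STATE_MAX },
};

/* Linear scan of a sentinel-terminated transition table. */
bool transition_allowed(const Transition *table, State from, State to)
{
    assert(table != NULL);
    for (const Transition *t = table; t->from != STATE_MAX; t++) {
        if (t->from == from && t->to == to) {
            return true;
        }
    }
    return false;
}

/* Before the patch, a panicked guest counted as "needs reset". */
bool needs_reset(State s, bool patched)
{
    return s == STATE_INTERNAL_ERROR ||
           s == STATE_SHUTDOWN ||
           (!patched && s == STATE_GUEST_PANICKED);
}

/*
 * Simplified model of "cont": refuse when a reset is required or the
 * table has no transition to running; otherwise resume the guest.
 */
bool model_cont(State *s, bool patched)
{
    assert(s != NULL);
    const Transition *table = patched ? after_patch : before_patch;

    if (needs_reset(*s, patched)) {
        return false;
    }
    if (!transition_allowed(table, *s, STATE_RUNNING)) {
        return false;
    }
    *s = STATE_RUNNING;
    return true;
}
```

In this model, calling model_cont() on a State holding STATE_GUEST_PANICKED
fails with patched set to false and succeeds with patched set to true.  That
is the distinction management has to handle when it sends "cont" to a QEMU
that may or may not carry the fix.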