[Qemu-devel] [PATCH] vl: allow "cont" from panicked state

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
@ 2013-08-21 12:01 Paolo Bonzini
  2013-08-21 12:42 ` Laszlo Ersek
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-21 12:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: pkrempa, marcel.a, libvir-list, mst, qemu-stable, lcapitulino,
	rhod, kraxel, anthony, hutao, lersek, afaerber

After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
The reason for this is that events are edge-triggered, and can be lost if
management dies at the wrong time.  Stopping a panicked VM lets management
know of a panic even if it has crashed; management can learn about the
panic when it restarts and queries running QEMU processes.  The downside
is of course that the VM will be paused while management is not running,
but that is acceptable if it only happens with explicit "-device pvpanic".

Upon learning of a panic, management (if configured to do so) can pick a
variety of behaviors: leave the VM paused, reset it, destroy it.  In
addition to all of these behaviors, it is possible dumping the VM core
from the host.

However, right now, the panicked state is irreversible, and can only be
exited by resetting the machine.  This means that any policy decision
is entirely in the hands of the host.  In particular there is no way to
use the "reboot on panic" option together with pvpanic.

This patch makes the panicked state reversible (and removes various
workarounds that were there because of the state being irreversible).
With this change, management has a wider set of possible policies: it
can just log the crash and leave policy to the guest, it can leave the
VM paused.  In particular, the "log the crash and continue" is implemented
simply by sending a "cont" as soon as management learns about the panic.
Management could also implement the "irreversible paused state" itself.
And again, all such actions can be coupled with dumping the VM core.

Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
it uses "-device pvpanic", management should check for "cont" failures.
If "cont" fails, management can then log that the VM remained paused
and urge the administrator to update QEMU.

I suggest that this patch be included in an 1.6.1 release as soon as
possible, and perhaps in the 1.5 branch too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 gdbstub.c | 3 ---
 vl.c      | 6 ++----
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 35ca7c2..747e67d 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s)
 #ifdef CONFIG_USER_ONLY
     s->running_state = 1;
 #else
-    if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
-        runstate_set(RUN_STATE_DEBUG);
-    }
     if (!runstate_needs_reset()) {
         vm_start();
     }
diff --git a/vl.c b/vl.c
index 25b8f2f..818d99e 100644
--- a/vl.c
+++ b/vl.c
@@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },

-    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
+    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
-    { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },

     { RUN_STATE_MAX, RUN_STATE_MAX },
 };
@@ -685,8 +684,7 @@ int runstate_is_running(void)
 bool runstate_needs_reset(void)
 {
     return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
-        runstate_check(RUN_STATE_SHUTDOWN) ||
-        runstate_check(RUN_STATE_GUEST_PANICKED);
+        runstate_check(RUN_STATE_SHUTDOWN);
 }

 StatusInfo *qmp_query_status(Error **errp)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 12:01 [Qemu-devel] [PATCH] vl: allow "cont" from panicked state Paolo Bonzini
@ 2013-08-21 12:42 ` Laszlo Ersek
  2013-08-21 12:43   ` Paolo Bonzini
  2013-08-21 13:32   ` Michael S. Tsirkin
  2013-08-21 13:30 ` Michael S. Tsirkin
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 22+ messages in thread
From: Laszlo Ersek @ 2013-08-21 12:42 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, mst, libvir-list, hutao, marcel.a, qemu-stable,
	qemu-devel, rhod, kraxel, anthony, lcapitulino, afaerber

one question below

On 08/21/13 14:01, Paolo Bonzini wrote:
> After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
> The reason for this is that events are edge-triggered, and can be lost if
> management dies at the wrong time.  Stopping a panicked VM lets management
> know of a panic even if it has crashed; management can learn about the
> panic when it restarts and queries running QEMU processes.  The downside
> is of course that the VM will be paused while management is not running,
> but that is acceptable if it only happens with explicit "-device pvpanic".
> 
> Upon learning of a panic, management (if configured to do so) can pick a
> variety of behaviors: leave the VM paused, reset it, destroy it.  In
> addition to all of these behaviors, it is possible dumping the VM core
> from the host.
> 
> However, right now, the panicked state is irreversible, and can only be
> exited by resetting the machine.  This means that any policy decision
> is entirely in the hands of the host.  In particular there is no way to
> use the "reboot on panic" option together with pvpanic.
> 
> This patch makes the panicked state reversible (and removes various
> workarounds that were there because of the state being irreversible).
> With this change, management has a wider set of possible policies: it
> can just log the crash and leave policy to the guest, it can leave the
> VM paused.  In particular, the "log the crash and continue" is implemented
> simply by sending a "cont" as soon as management learns about the panic.
> Management could also implement the "irreversible paused state" itself.
> And again, all such actions can be coupled with dumping the VM core.
> 
> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
> it uses "-device pvpanic", management should check for "cont" failures.
> If "cont" fails, management can then log that the VM remained paused
> and urge the administrator to update QEMU.
> 
> I suggest that this patch be included in an 1.6.1 release as soon as
> possible, and perhaps in the 1.5 branch too.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  gdbstub.c | 3 ---
>  vl.c      | 6 ++----
>  2 files changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index 35ca7c2..747e67d 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s)
>  #ifdef CONFIG_USER_ONLY
>      s->running_state = 1;
>  #else
> -    if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
> -        runstate_set(RUN_STATE_DEBUG);
> -    }

Undoes bc7d0e66. Makes sense -- what we're allowing now is laxer
(includes the above).

>      if (!runstate_needs_reset()) {
>          vm_start();
>      }
> diff --git a/vl.c b/vl.c
> index 25b8f2f..818d99e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>  
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },

I don't understand why this used to be here.

So, why? (*)

> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },

This is the "cont" we care about -- it should allow the guest to kexec
or reboot (ie. the panic notifier will return).

>      { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },

Undoes bc7d0e66. OK.

>  
>      { RUN_STATE_MAX, RUN_STATE_MAX },
>  };
> @@ -685,8 +684,7 @@ int runstate_is_running(void)
>  bool runstate_needs_reset(void)
>  {
>      return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
> -        runstate_check(RUN_STATE_SHUTDOWN) ||
> -        runstate_check(RUN_STATE_GUEST_PANICKED);
> +        runstate_check(RUN_STATE_SHUTDOWN);
>  }
>  
>  StatusInfo *qmp_query_status(Error **errp)
> 

This last hunk in effect reverts the runstate_needs_reset() changes of
the initial pvpanic commit ede085b3.

(*) Hm I think I understand why. main_loop_should_exit(), when a reset
was requested *and* runstate_needs_reset() evaluated to true, used to
set the runstate to PAUSED -- I guess temporarily.

Since PANICKED was included in runstate_needs_reset(), this generic code
could request a transition from PANICKED to PAUSED (**). As PANICKED is
being removed from runstate_needs_reset(), the PANICKED->PAUSED
transition is not required any longer.

(**) I don't know why the generic code moves to PAUSED temporarily (from
INTERNAL_ERROR and SHUTDOWN), but I'll just accept that as status quo.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

(Note that my R-b is mostly worthless: similarly to the ACPI table move,
I've been happily acking patches with opposite goals here, and that
seriously questions whether my review adds any value (beyond the lowest
technical level).)

Laszlo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 12:42 ` Laszlo Ersek
@ 2013-08-21 12:43   ` Paolo Bonzini
  2013-08-21 14:17     ` Luiz Capitulino
  2013-08-21 13:32   ` Michael S. Tsirkin
  1 sibling, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-21 12:43 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: pkrempa, marcel.a, libvir-list, hutao, mst, qemu-stable,
	qemu-devel, rhod, kraxel, anthony, lcapitulino, afaerber

Il 21/08/2013 14:42, Laszlo Ersek ha scritto:
> (*) Hm I think I understand why. main_loop_should_exit(), when a reset
> was requested *and* runstate_needs_reset() evaluated to true, used to
> set the runstate to PAUSED -- I guess temporarily.

Yes, this is the code that does the PANICKED -> PAUSED transition:

        if (runstate_needs_reset()) {
            runstate_set(RUN_STATE_PAUSED);
        }

This is to move the system out of a runstate that needs_reset(), and
make the subsequent "cont" work instead of hitting this:

    if (runstate_needs_reset()) {
        error_set(errp, QERR_RESET_REQUIRED);
        return;
    }

Paolo

> Since PANICKED was included in runstate_needs_reset(), this generic code
> could request a transition from PANICKED to PAUSED (**). As PANICKED is
> being removed from runstate_needs_reset(), the PANICKED->PAUSED
> transition is not required any longer.
> 
> (**) I don't know why the generic code moves to PAUSED temporarily (from
> INTERNAL_ERROR and SHUTDOWN), but I'll just accept that as status quo.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 12:01 [Qemu-devel] [PATCH] vl: allow "cont" from panicked state Paolo Bonzini
  2013-08-21 12:42 ` Laszlo Ersek
@ 2013-08-21 13:30 ` Michael S. Tsirkin
  2013-08-21 13:46   ` Paolo Bonzini
  2013-08-21 14:11 ` Luiz Capitulino
  2013-08-21 15:23 ` Eric Blake
  3 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2013-08-21 13:30 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, qemu-stable, qemu-devel,
	lcapitulino, rhod, kraxel, anthony, hutao, lersek, afaerber

On Wed, Aug 21, 2013 at 02:01:17PM +0200, Paolo Bonzini wrote:
> After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
> The reason for this is that events are edge-triggered, and can be lost if
> management dies at the wrong time.  Stopping a panicked VM lets management
> know of a panic even if it has crashed; management can learn about the
> panic when it restarts and queries running QEMU processes.  The downside
> is of course that the VM will be paused while management is not running,
> but that is acceptable if it only happens with explicit "-device pvpanic".
> 
> Upon learning of a panic, management (if configured to do so) can pick a
> variety of behaviors: leave the VM paused, reset it, destroy it.  In
> addition to all of these behaviors, it is possible dumping the VM core
> from the host.
> 
> However, right now, the panicked state is irreversible, and can only be
> exited by resetting the machine.  This means that any policy decision
> is entirely in the hands of the host.  In particular there is no way to
> use the "reboot on panic" option together with pvpanic.
> 
> This patch makes the panicked state reversible (and removes various
> workarounds that were there because of the state being irreversible).
> With this change, management has a wider set of possible policies: it
> can just log the crash and leave policy to the guest, it can leave the
> VM paused.  In particular, the "log the crash and continue" is implemented
> simply by sending a "cont" as soon as management learns about the panic.
> Management could also implement the "irreversible paused state" itself.
> And again, all such actions can be coupled with dumping the VM core.
> 
> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
> it uses "-device pvpanic", management should check for "cont" failures.
> If "cont" fails, management can then log that the VM remained paused
> and urge the administrator to update QEMU.
> 
> I suggest that this patch be included in an 1.6.1 release as soon as
> possible, and perhaps in the 1.5 branch too.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

OK this does sound reasonable, but it looks like current behaviour
was intentional, so I wonder why was it put in place.
Any idea?

> ---
>  gdbstub.c | 3 ---
>  vl.c      | 6 ++----
>  2 files changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index 35ca7c2..747e67d 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s)
>  #ifdef CONFIG_USER_ONLY
>      s->running_state = 1;
>  #else
> -    if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
> -        runstate_set(RUN_STATE_DEBUG);
> -    }
>      if (!runstate_needs_reset()) {
>          vm_start();
>      }
> diff --git a/vl.c b/vl.c
> index 25b8f2f..818d99e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>  
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
>      { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },
>  
>      { RUN_STATE_MAX, RUN_STATE_MAX },
>  };
> @@ -685,8 +684,7 @@ int runstate_is_running(void)
>  bool runstate_needs_reset(void)
>  {
>      return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
> -        runstate_check(RUN_STATE_SHUTDOWN) ||
> -        runstate_check(RUN_STATE_GUEST_PANICKED);
> +        runstate_check(RUN_STATE_SHUTDOWN);
>  }
>  
>  StatusInfo *qmp_query_status(Error **errp)
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 12:42 ` Laszlo Ersek
  2013-08-21 12:43   ` Paolo Bonzini
@ 2013-08-21 13:32   ` Michael S. Tsirkin
  1 sibling, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2013-08-21 13:32 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, qemu-devel,
	rhod, kraxel, anthony, Paolo Bonzini, lcapitulino, afaerber

On Wed, Aug 21, 2013 at 02:42:38PM +0200, Laszlo Ersek wrote:
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
> 
> (Note that my R-b is mostly worthless: similarly to the ACPI table move,
> I've been happily acking patches with opposite goals here, and that
> seriously questions whether my review adds any value (beyond the lowest
> technical level).)
> 
> Laszlo

Your reviews were valuable.
Keep it up, pls don't feel discouraged :)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 13:30 ` Michael S. Tsirkin
@ 2013-08-21 13:46   ` Paolo Bonzini
  0 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-21 13:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, qemu-devel,
	lcapitulino, rhod, kraxel, anthony, lersek, afaerber

Il 21/08/2013 15:30, Michael S. Tsirkin ha scritto:
> On Wed, Aug 21, 2013 at 02:01:17PM +0200, Paolo Bonzini wrote:
>> After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
>> The reason for this is that events are edge-triggered, and can be lost if
>> management dies at the wrong time.  Stopping a panicked VM lets management
>> know of a panic even if it has crashed; management can learn about the
>> panic when it restarts and queries running QEMU processes.  The downside
>> is of course that the VM will be paused while management is not running,
>> but that is acceptable if it only happens with explicit "-device pvpanic".
>>
>> Upon learning of a panic, management (if configured to do so) can pick a
>> variety of behaviors: leave the VM paused, reset it, destroy it.  In
>> addition to all of these behaviors, it is possible dumping the VM core
>> from the host.
>>
>> However, right now, the panicked state is irreversible, and can only be
>> exited by resetting the machine.  This means that any policy decision
>> is entirely in the hands of the host.  In particular there is no way to
>> use the "reboot on panic" option together with pvpanic.
>>
>> This patch makes the panicked state reversible (and removes various
>> workarounds that were there because of the state being irreversible).
>> With this change, management has a wider set of possible policies: it
>> can just log the crash and leave policy to the guest, it can leave the
>> VM paused.  In particular, the "log the crash and continue" is implemented
>> simply by sending a "cont" as soon as management learns about the panic.
>> Management could also implement the "irreversible paused state" itself.
>> And again, all such actions can be coupled with dumping the VM core.
>>
>> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
>> it uses "-device pvpanic", management should check for "cont" failures.
>> If "cont" fails, management can then log that the VM remained paused
>> and urge the administrator to update QEMU.
>>
>> I suggest that this patch be included in an 1.6.1 release as soon as
>> possible, and perhaps in the 1.5 branch too.
>>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> OK this does sound reasonable, but it looks like current behaviour
> was intentional

Yes, it was intentional and it also sounded reasonable at the time.  The
gdbstub.c hack definitely should have raised a warning sign, though.

I suspect it was done this way simply because Xen has a "crashed" state
that behaves exactly like that, with policy determined exclusively by
the host.  And it has the same problems, in fact, even though I never
heard anyone complain about it for some weird reason...

With this patch, the same thing can be implemented at the libvirt level,
and the GUEST_PANICKED runstate now matches the semantics of watchdogs
too, so this solution is not only easier to use, but also more
consistent with the rest of QEMU.

Paolo

, so I wonder why was it put in place.
> Any idea?
> 
>> ---
>>  gdbstub.c | 3 ---
>>  vl.c      | 6 ++----
>>  2 files changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/gdbstub.c b/gdbstub.c
>> index 35ca7c2..747e67d 100644
>> --- a/gdbstub.c
>> +++ b/gdbstub.c
>> @@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s)
>>  #ifdef CONFIG_USER_ONLY
>>      s->running_state = 1;
>>  #else
>> -    if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
>> -        runstate_set(RUN_STATE_DEBUG);
>> -    }
>>      if (!runstate_needs_reset()) {
>>          vm_start();
>>      }
>> diff --git a/vl.c b/vl.c
>> index 25b8f2f..818d99e 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>>  
>> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
>> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
>>      { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
>> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },
>>  
>>      { RUN_STATE_MAX, RUN_STATE_MAX },
>>  };
>> @@ -685,8 +684,7 @@ int runstate_is_running(void)
>>  bool runstate_needs_reset(void)
>>  {
>>      return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
>> -        runstate_check(RUN_STATE_SHUTDOWN) ||
>> -        runstate_check(RUN_STATE_GUEST_PANICKED);
>> +        runstate_check(RUN_STATE_SHUTDOWN);
>>  }
>>  
>>  StatusInfo *qmp_query_status(Error **errp)
>> -- 
>> 1.8.3.1
> 
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 12:01 [Qemu-devel] [PATCH] vl: allow "cont" from panicked state Paolo Bonzini
  2013-08-21 12:42 ` Laszlo Ersek
  2013-08-21 13:30 ` Michael S. Tsirkin
@ 2013-08-21 14:11 ` Luiz Capitulino
  2013-08-21 15:23 ` Eric Blake
  3 siblings, 0 replies; 22+ messages in thread
From: Luiz Capitulino @ 2013-08-21 14:11 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, mst, qemu-devel, qemu-stable,
	rhod, kraxel, anthony, hutao, lersek, afaerber

On Wed, 21 Aug 2013 14:01:17 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
> The reason for this is that events are edge-triggered, and can be lost if
> management dies at the wrong time.  Stopping a panicked VM lets management
> know of a panic even if it has crashed; management can learn about the
> panic when it restarts and queries running QEMU processes.  The downside
> is of course that the VM will be paused while management is not running,
> but that is acceptable if it only happens with explicit "-device pvpanic".
> 
> Upon learning of a panic, management (if configured to do so) can pick a
> variety of behaviors: leave the VM paused, reset it, destroy it.  In
> addition to all of these behaviors, it is possible dumping the VM core
> from the host.
> 
> However, right now, the panicked state is irreversible, and can only be
> exited by resetting the machine.  This means that any policy decision
> is entirely in the hands of the host.  In particular there is no way to
> use the "reboot on panic" option together with pvpanic.
> 
> This patch makes the panicked state reversible (and removes various
> workarounds that were there because of the state being irreversible).
> With this change, management has a wider set of possible policies: it
> can just log the crash and leave policy to the guest, it can leave the
> VM paused.  In particular, the "log the crash and continue" is implemented
> simply by sending a "cont" as soon as management learns about the panic.
> Management could also implement the "irreversible paused state" itself.
> And again, all such actions can be coupled with dumping the VM core.
> 
> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
> it uses "-device pvpanic", management should check for "cont" failures.
> If "cont" fails, management can then log that the VM remained paused
> and urge the administrator to update QEMU.
> 
> I suggest that this patch be included in an 1.6.1 release as soon as
> possible, and perhaps in the 1.5 branch too.

Looks good to me:

Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>

> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  gdbstub.c | 3 ---
>  vl.c      | 6 ++----
>  2 files changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index 35ca7c2..747e67d 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s)
>  #ifdef CONFIG_USER_ONLY
>      s->running_state = 1;
>  #else
> -    if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
> -        runstate_set(RUN_STATE_DEBUG);
> -    }
>      if (!runstate_needs_reset()) {
>          vm_start();
>      }
> diff --git a/vl.c b/vl.c
> index 25b8f2f..818d99e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>  
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
>      { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },
>  
>      { RUN_STATE_MAX, RUN_STATE_MAX },
>  };
> @@ -685,8 +684,7 @@ int runstate_is_running(void)
>  bool runstate_needs_reset(void)
>  {
>      return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
> -        runstate_check(RUN_STATE_SHUTDOWN) ||
> -        runstate_check(RUN_STATE_GUEST_PANICKED);
> +        runstate_check(RUN_STATE_SHUTDOWN);
>  }
>  
>  StatusInfo *qmp_query_status(Error **errp)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 12:43   ` Paolo Bonzini
@ 2013-08-21 14:17     ` Luiz Capitulino
  2013-08-21 14:30       ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Luiz Capitulino @ 2013-08-21 14:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, hutao, mst, qemu-stable,
	qemu-devel, rhod, kraxel, anthony, Laszlo Ersek, afaerber

On Wed, 21 Aug 2013 14:43:11 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 21/08/2013 14:42, Laszlo Ersek ha scritto:
> > (*) Hm I think I understand why. main_loop_should_exit(), when a reset
> > was requested *and* runstate_needs_reset() evaluated to true, used to
> > set the runstate to PAUSED -- I guess temporarily.
> 
> Yes, this is the code that does the PANICKED -> PAUSED transition:
> 
>         if (runstate_needs_reset()) {
>             runstate_set(RUN_STATE_PAUSED);
>         }
> 
> This is to move the system out of a runstate that needs_reset(), and
> make the subsequent "cont" work instead of hitting this:
> 
>     if (runstate_needs_reset()) {
>         error_set(errp, QERR_RESET_REQUIRED);
>         return;
>     }

Yes. For those states issuing 'cont' won't put the guest to run again,
so you're required to reset the guest first.

I think the same reasoning went behind the PANICKED state, and for most
cases it's going to be disastrous to put the guest to run again, but
I can understand that this is up user/mngt to decide this, not QEMU.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 14:17     ` Luiz Capitulino
@ 2013-08-21 14:30       ` Michael S. Tsirkin
  2013-08-21 14:37         ` Paolo Bonzini
  0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2013-08-21 14:30 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, qemu-devel,
	rhod, kraxel, anthony, Paolo Bonzini, Laszlo Ersek, afaerber

On Wed, Aug 21, 2013 at 10:17:49AM -0400, Luiz Capitulino wrote:
> On Wed, 21 Aug 2013 14:43:11 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > Il 21/08/2013 14:42, Laszlo Ersek ha scritto:
> > > (*) Hm I think I understand why. main_loop_should_exit(), when a reset
> > > was requested *and* runstate_needs_reset() evaluated to true, used to
> > > set the runstate to PAUSED -- I guess temporarily.
> > 
> > Yes, this is the code that does the PANICKED -> PAUSED transition:
> > 
> >         if (runstate_needs_reset()) {
> >             runstate_set(RUN_STATE_PAUSED);
> >         }
> > 
> > This is to move the system out of a runstate that needs_reset(), and
> > make the subsequent "cont" work instead of hitting this:
> > 
> >     if (runstate_needs_reset()) {
> >         error_set(errp, QERR_RESET_REQUIRED);
> >         return;
> >     }
> 
> Yes. For those states issuing 'cont' won't put the guest to run again,
> so you're required to reset the guest first.
> 
> I think the same reasoning went behind the PANICKED state, and for most
> cases it's going to be disastrous to put the guest to run again,

Why will it? It will most likely just call halt a bit later.

> but
> I can understand that this is up user/mngt to decide this, not QEMU.

I don't have a problem with this patch as such, so

Acked-by: Michael S. Tsirkin <mst@redhat.com>

though I'm still not really sure why do we
want to block guest immediately on panic.
Why not let it call halt a bit later?


-- 
MST

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 14:30       ` Michael S. Tsirkin
@ 2013-08-21 14:37         ` Paolo Bonzini
  2013-08-21 14:58           ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-21 14:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, pkrempa, marcel.a, libvir-list, hutao, qemu-stable,
	Luiz Capitulino, rhod, kraxel, anthony, Laszlo Ersek, afaerber

Il 21/08/2013 16:30, Michael S. Tsirkin ha scritto:
>> I think the same reasoning went behind the PANICKED state, and for most
>> cases it's going to be disastrous to put the guest to run again,
> 
> Why will it? It will most likely just call halt a bit later.

I agree.

>> but I can understand that this is up user/mngt to decide this, not QEMU.
> 
> I don't have a problem with this patch as such, so
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> 
> though I'm still not really sure why do we
> want to block guest immediately on panic.
> Why not let it call halt a bit later?

To make sure the panic is detected, and action taken, in the host even
if management has crashed at the time.  For example, even if you have
reboot-on-panic active, management has time to take a core dump of the
paused guest _before_ the reboot.

Paolo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 14:37         ` Paolo Bonzini
@ 2013-08-21 14:58           ` Michael S. Tsirkin
  2013-08-21 15:07             ` Paolo Bonzini
  0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2013-08-21 14:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, qemu-devel,
	rhod, kraxel, anthony, Luiz Capitulino, Laszlo Ersek, afaerber

On Wed, Aug 21, 2013 at 04:37:56PM +0200, Paolo Bonzini wrote:
> Il 21/08/2013 16:30, Michael S. Tsirkin ha scritto:
> >> I think the same reasoning went behind the PANICKED state, and for most
> >> cases it's going to be disastrous to put the guest to run again,
> > 
> > Why will it? It will most likely just call halt a bit later.
> 
> I agree.
> 
> >> but I can understand that this is up user/mngt to decide this, not QEMU.
> > 
> > I don't have a problem with this patch as such, so
> > 
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > though I'm still not really sure why do we
> > want to block guest immediately on panic.
> > Why not let it call halt a bit later?
> 
> To make sure the panic is detected, and action taken, in the host even
> if management has crashed at the time.

I'm not sure I get the reference to management crashing - we just
need to maintain "panicked" state to make sure info is not lost ...

> For example, even if you have
> reboot-on-panic active, management has time to take a core dump of the
> paused guest _before_ the reboot.
> 
> Paolo

but this sounds like a good reason to support synchronous panic events.

-- 
MST

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 14:58           ` Michael S. Tsirkin
@ 2013-08-21 15:07             ` Paolo Bonzini
  0 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-21 15:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, qemu-devel,
	rhod, kraxel, anthony, Luiz Capitulino, Laszlo Ersek, afaerber

Il 21/08/2013 16:58, Michael S. Tsirkin ha scritto:
> On Wed, Aug 21, 2013 at 04:37:56PM +0200, Paolo Bonzini wrote:
>> Il 21/08/2013 16:30, Michael S. Tsirkin ha scritto:
>>>> I think the same reasoning went behind the PANICKED state, and for most
>>>> cases it's going to be disastrous to put the guest to run again,
>>>
>>> Why will it? It will most likely just call halt a bit later.
>>
>> I agree.
>>
>>>> but I can understand that this is up user/mngt to decide this, not QEMU.
>>>
>>> I don't have a problem with this patch as such, so
>>>
>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>>
>>> though I'm still not really sure why do we
>>> want to block guest immediately on panic.
>>> Why not let it call halt a bit later?
>>
>> To make sure the panic is detected, and action taken, in the host even
>> if management has crashed at the time.
> 
> I'm not sure I get the reference to management crashing - we just
> need to maintain "panicked" state to make sure info is not lost ...

Yes, but that would be additional burden (a new "info" command to
retrieve the exact time of panicking, or something like that).  Since
synchronous panic events are useful anyway, and crashed management is
rare, it's simpler to go the synchronous way.

Paolo

>> For example, even if you have
>> reboot-on-panic active, management has time to take a core dump of the
>> paused guest _before_ the reboot.
>>
>> Paolo
> 
> but this sounds like a good reason to support synchronous panic events.
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 12:01 [Qemu-devel] [PATCH] vl: allow "cont" from panicked state Paolo Bonzini
                   ` (2 preceding siblings ...)
  2013-08-21 14:11 ` Luiz Capitulino
@ 2013-08-21 15:23 ` Eric Blake
  2013-08-21 15:32   ` Paolo Bonzini
  3 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2013-08-21 15:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, mst,
	qemu-devel, lcapitulino, rhod, kraxel, anthony, lersek, afaerber

[-- Attachment #1: Type: text/plain, Size: 4458 bytes --]

On 08/21/2013 06:01 AM, Paolo Bonzini wrote:
> After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
> The reason for this is that events are edge-triggered, and can be lost if
> management dies at the wrong time.  Stopping a panicked VM lets management
> know of a panic even if it has crashed; management can learn about the
> panic when it restarts and queries running QEMU processes.  The downside
> is of course that the VM will be paused while management is not running,
> but that is acceptable if it only happens with explicit "-device pvpanic".

Agreed - the key point is that by having a command line option to opt in
to panic handling, then libvirt can decide whether panics should pause
or auto-resume based on its <on_crash> settings being mapped to
appropriate command lines.

> 
> Upon learning of a panic, management (if configured to do so) can pick a
> variety of behaviors: leave the VM paused, reset it, destroy it.  In
> addition to all of these behaviors, it is possible dumping the VM core
> from the host.

s/possible dumping/possible to dump/

and yes, libvirt wants to do just that, as one of its <on_crash>
mappings, since it could do the same for Xen.

> 
> However, right now, the panicked state is irreversible, and can only be
> exited by resetting the machine.  This means that any policy decision
> is entirely in the hands of the host.  In particular there is no way to
> use the "reboot on panic" option together with pvpanic.
> 
> This patch makes the panicked state reversible (and removes various
> workarounds that were there because of the state being irreversible).
> With this change, management has a wider set of possible policies: it
> can just log the crash and leave policy to the guest, it can leave the
> VM paused.  In particular, the "log the crash and continue" is implemented
> simply by sending a "cont" as soon as management learns about the panic.
> Management could also implement the "irreversible paused state" itself.
> And again, all such actions can be coupled with dumping the VM core.

Yes, this makes sense.

> 
> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
> it uses "-device pvpanic", management should check for "cont" failures.
> If "cont" fails, management can then log that the VM remained paused
> and urge the administrator to update QEMU.

Is that the best we can do?  Is there any sort of QMP introspection that
libvirt can do, where we can know UP FRONT what level of panic support
is provided by the qemu binary and the machine type being run in that
binary?  I'm afraid we've created a complicated mess of what options
work when, and I'm not looking forward to what it will take to encode
all the correct workarounds into libvirt.  Ideally, I'd like a one-shot
question: is qemu new enough to sanely support reversible '-device
pvpanic'? If so, honor <on_crash> settings, if not, reject attempts to
use any <on_crash> setting other than the default that matches qemu 1.4
behavior - but I might be persuaded to also support qemu 1.5/1.6
behaviors if they are easy enough to detect and work with; there's also
the question that the behavior is machine-type dependent (-M pc-1.5
behaves differently than -M pc-1.6).

> 
> I suggest that this patch be included in an 1.6.1 release as soon as
> possible, and perhaps in the 1.5 branch too.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  gdbstub.c | 3 ---
>  vl.c      | 6 ++----
>  2 files changed, 2 insertions(+), 7 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

/me why oh why did we rush such a half-baked builtin design into qemu
1.5 again?

> +++ b/vl.c
> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>  
> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },

Is 'cont' the only viable way to escape PANICKED, or is it also
reasonable to support 'stop' as a way to transition from PANICKED to
PAUSED?  That is, management may want to make the state reversible but
still leave the guest paused, so this patch may be incomplete.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 15:23 ` Eric Blake
@ 2013-08-21 15:32   ` Paolo Bonzini
  2013-08-21 15:44     ` Michael S. Tsirkin
  2013-08-22  8:38     ` Laszlo Ersek
  0 siblings, 2 replies; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-21 15:32 UTC (permalink / raw)
  To: Eric Blake
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, mst,
	qemu-devel, lcapitulino, rhod, kraxel, anthony, lersek, afaerber

Il 21/08/2013 17:23, Eric Blake ha scritto:
>> Upon learning of a panic, management (if configured to do so) can pick a
>> variety of behaviors: leave the VM paused, reset it, destroy it.  In
>> addition to all of these behaviors, it is possible dumping the VM core
>> from the host.
> 
> s/possible dumping/possible to dump/
> 
> and yes, libvirt wants to do just that, as one of its <on_crash>
> mappings, since it could do the same for Xen.
> 
>>
>> However, right now, the panicked state is irreversible, and can only be
>> exited by resetting the machine.  This means that any policy decision
>> is entirely in the hands of the host.  In particular there is no way to
>> use the "reboot on panic" option together with pvpanic.
>>
>> This patch makes the panicked state reversible (and removes various
>> workarounds that were there because of the state being irreversible).
>> With this change, management has a wider set of possible policies: it
>> can just log the crash and leave policy to the guest, it can leave the
>> VM paused.  In particular, the "log the crash and continue" is implemented
>> simply by sending a "cont" as soon as management learns about the panic.
>> Management could also implement the "irreversible paused state" itself.
>> And again, all such actions can be coupled with dumping the VM core.
> 
> Yes, this makes sense.
> 
>>
>> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
>> it uses "-device pvpanic", management should check for "cont" failures.
>> If "cont" fails, management can then log that the VM remained paused
>> and urge the administrator to update QEMU.
> 
> Is that the best we can do?  Is there any sort of QMP introspection that
> libvirt can do, where we can know UP FRONT what level of panic support
> is provided by the qemu binary and the machine type being run in that
> binary?

No, this is not possible unfortunately.  The only possibility that comes
to mind would be to rename the pvpanic device, e.g. to "isa-pvpanic",
and forget about "-device pvpanic" on 1.6.x.  A hack, I know.

To support 1.5, libvirt should simply be ready to react to unanticipated
GUEST_PANICKED events.  reboot-on-panic will simply be broken for 1.5
and Linux 3.10+ guests. :(

>> +++ b/vl.c
>> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>>  
>> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
>> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
> 
> Is 'cont' the only viable way to escape PANICKED, or is it also
> reasonable to support 'stop' as a way to transition from PANICKED to
> PAUSED?  That is, management may want to make the state reversible but
> still leave the guest paused, so this patch may be incomplete.

No, there is no way to move from PANICKED to PAUSED.  Libvirt has its
own statuses (PAUSED, CRASHED etc.) and substatuses.  You don't really
care about the QEMU state: both the PAUSED_PANICKED and CRASHED_PANICKED
substatuses map to QEMU's GUEST_PANICKED state.  Simply, libvirt will
not allow a "virsh resume" for <on_crash>preserve</on_crash>, and will
allow it for a hypothetical new <on_crash>pause</on_crash> element.

BTW, any chance "coredump-destroy" and "coredump-restart" can be
preserved just for backwards compatibility, and a new coredump='yes/no'
attribute introduced instead?  Because coredump-pause and
coredump-preserve would make just as much sense.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 15:32   ` Paolo Bonzini
@ 2013-08-21 15:44     ` Michael S. Tsirkin
  2013-08-22  8:38     ` Laszlo Ersek
  1 sibling, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2013-08-21 15:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, qemu-stable, qemu-devel,
	lcapitulino, rhod, kraxel, anthony, hutao, lersek, afaerber

On Wed, Aug 21, 2013 at 05:32:27PM +0200, Paolo Bonzini wrote:
> Il 21/08/2013 17:23, Eric Blake ha scritto:
> >> Upon learning of a panic, management (if configured to do so) can pick a
> >> variety of behaviors: leave the VM paused, reset it, destroy it.  In
> >> addition to all of these behaviors, it is possible dumping the VM core
> >> from the host.
> > 
> > s/possible dumping/possible to dump/
> > 
> > and yes, libvirt wants to do just that, as one of its <on_crash>
> > mappings, since it could do the same for Xen.
> > 
> >>
> >> However, right now, the panicked state is irreversible, and can only be
> >> exited by resetting the machine.  This means that any policy decision
> >> is entirely in the hands of the host.  In particular there is no way to
> >> use the "reboot on panic" option together with pvpanic.
> >>
> >> This patch makes the panicked state reversible (and removes various
> >> workarounds that were there because of the state being irreversible).
> >> With this change, management has a wider set of possible policies: it
> >> can just log the crash and leave policy to the guest, it can leave the
> >> VM paused.  In particular, the "log the crash and continue" is implemented
> >> simply by sending a "cont" as soon as management learns about the panic.
> >> Management could also implement the "irreversible paused state" itself.
> >> And again, all such actions can be coupled with dumping the VM core.
> > 
> > Yes, this makes sense.
> > 
> >>
> >> Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
> >> it uses "-device pvpanic", management should check for "cont" failures.
> >> If "cont" fails, management can then log that the VM remained paused
> >> and urge the administrator to update QEMU.
> > 
> > Is that the best we can do?  Is there any sort of QMP introspection that
> > libvirt can do, where we can know UP FRONT what level of panic support
> > is provided by the qemu binary and the machine type being run in that
> > binary?
> 
> No, this is not possible unfortunately.  The only possibility that comes
> to mind would be to rename the pvpanic device, e.g. to "isa-pvpanic",
> and forget about "-device pvpanic" on 1.6.x.  A hack, I know.
> 
> To support 1.5, libvirt should simply be ready to react to unanticipated
> GUEST_PANICKED events.  reboot-on-panic will simply be broken for 1.5
> and Linux 3.10+ guests. :(

Let's just fix the bugs in 1.6.X.
I don't think libvirt needs to work around all qemu bugs.

For 1.5.X it might be possible to backport -device pvpanic there.
We need to make sure cross-version migration works.

> >> +++ b/vl.c
> >> @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = {
> >>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
> >>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
> >>  
> >> -    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
> >> +    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
> > 
> > Is 'cont' the only viable way to escape PANICKED, or is it also
> > reasonable to support 'stop' as a way to transition from PANICKED to
> > PAUSED?  That is, management may want to make the state reversible but
> > still leave the guest paused, so this patch may be incomplete.
> 
> No, there is no way to move from PANICKED to PAUSED.  Libvirt has its
> own statuses (PAUSED, CRASHED etc.) and substatuses.  You don't really
> care about the QEMU state: both the PAUSED_PANICKED and CRASHED_PANICKED
> substatuses map to QEMU's GUEST_PANICKED state.  Simply, libvirt will
> not allow a "virsh resume" for <on_crash>preserve</on_crash>, and will
> allow it for a hypothetical new <on_crash>pause</on_crash> element.
> 
> BTW, any chance "coredump-destroy" and "coredump-restart" can be
> preserved just for backwards compatibility, and a new coredump='yes/no'
> attribute introduced instead?  Because coredump-pause and
> coredump-preserve would make just as much sense.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-21 15:32   ` Paolo Bonzini
  2013-08-21 15:44     ` Michael S. Tsirkin
@ 2013-08-22  8:38     ` Laszlo Ersek
  2013-08-22  9:19       ` Paolo Bonzini
  1 sibling, 1 reply; 22+ messages in thread
From: Laszlo Ersek @ 2013-08-22  8:38 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, mst,
	qemu-devel, lcapitulino, rhod, kraxel, anthony, afaerber

On 08/21/13 17:32, Paolo Bonzini wrote:

> To support 1.5, libvirt should simply be ready to react to unanticipated
> GUEST_PANICKED events.  reboot-on-panic will simply be broken for 1.5
> and Linux 3.10+ guests. :(

I'm probably misunderstanding the discussion, but it might be possible
to disable pvpanic even in 1.5 from the host side, with the following hack:

  -global pvpanic.ioport=0

In qemu, this will either configure a working pvpanic device on ioport
0, or the pvpanic device will be genuinely broken. At least it doesn't
(obviously) break other stuff (in v1.5.2):

(qemu) info mtree
I/O
0000000000000000-000000000000ffff (prio 0, RW): io
  0000000000000000-0000000000000000 (prio 0, RW): pvpanic
  0000000000000000-0000000000000007 (prio 0, RW): dma-chan

(qemu) info qtree
bus: main-system-bus
  dev: i440FX-pcihost, id ""
    bus: pci.0
      dev: PIIX3, id ""
        bus: isa.0
          dev: pvpanic, id ""
            ioport = 0

Either way, the "etc/pvpanic-port" fw_cfg file will contain 0, and
SeaBIOS will interpret it as "no pvpanic device". It will report the
same to the Linux guest too (_STA will return 0 for QEMU0001;
pvpanic_add() --> -ENODEV). Thus, no pvpanic notifier should be
registered, and reboot-on-panic should be reachable in the guest.

A horrible hack, certainly.

Laszlo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-22  8:38     ` Laszlo Ersek
@ 2013-08-22  9:19       ` Paolo Bonzini
  2013-08-22  9:37         ` Michael S. Tsirkin
  2013-08-22 10:34         ` Laszlo Ersek
  0 siblings, 2 replies; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-22  9:19 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, mst,
	qemu-devel, lcapitulino, rhod, kraxel, anthony, afaerber

Il 22/08/2013 10:38, Laszlo Ersek ha scritto:
>> > To support 1.5, libvirt should simply be ready to react to unanticipated
>> > GUEST_PANICKED events.  reboot-on-panic will simply be broken for 1.5
>> > and Linux 3.10+ guests. :(
> I'm probably misunderstanding the discussion, but it might be possible
> to disable pvpanic even in 1.5 from the host side, with the following hack:
> 
>   -global pvpanic.ioport=0
> 
> In qemu, this will either configure a working pvpanic device on ioport
> 0, or the pvpanic device will be genuinely broken. At least it doesn't
> (obviously) break other stuff (in v1.5.2):
> 
> (qemu) info mtree
> I/O
> 0000000000000000-000000000000ffff (prio 0, RW): io
>   0000000000000000-0000000000000000 (prio 0, RW): pvpanic
>   0000000000000000-0000000000000007 (prio 0, RW): dma-chan

No, you're not misunderstanding the discussion.

Depending on the priorities of the pvpanic and legacy-DMA regions, it
would break DMA channel 0.  Channel 0 is (was) used for DRAM refresh, so
it should not have any visible effect.  However, it may not be entirely
disabling pvpanic, just making it mostly invisible.

Paolo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-22  9:19       ` Paolo Bonzini
@ 2013-08-22  9:37         ` Michael S. Tsirkin
  2013-08-22  9:52           ` Paolo Bonzini
  2013-08-22 10:34         ` Laszlo Ersek
  1 sibling, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2013-08-22  9:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, Laszlo Ersek, qemu-stable,
	qemu-devel, lcapitulino, rhod, kraxel, anthony, hutao, afaerber

On Thu, Aug 22, 2013 at 11:19:44AM +0200, Paolo Bonzini wrote:
> Il 22/08/2013 10:38, Laszlo Ersek ha scritto:
> >> > To support 1.5, libvirt should simply be ready to react to unanticipated
> >> > GUEST_PANICKED events.  reboot-on-panic will simply be broken for 1.5
> >> > and Linux 3.10+ guests. :(
> > I'm probably misunderstanding the discussion, but it might be possible
> > to disable pvpanic even in 1.5 from the host side, with the following hack:
> > 
> >   -global pvpanic.ioport=0
> > 
> > In qemu, this will either configure a working pvpanic device on ioport
> > 0, or the pvpanic device will be genuinely broken. At least it doesn't
> > (obviously) break other stuff (in v1.5.2):
> > 
> > (qemu) info mtree
> > I/O
> > 0000000000000000-000000000000ffff (prio 0, RW): io
> >   0000000000000000-0000000000000000 (prio 0, RW): pvpanic
> >   0000000000000000-0000000000000007 (prio 0, RW): dma-chan
> 
> No, you're not misunderstanding the discussion.
> 
> Depending on the priorities of the pvpanic and legacy-DMA regions, it
> would break DMA channel 0.  Channel 0 is (was) used for DRAM refresh, so
> it should not have any visible effect.  However, it may not be entirely
> disabling pvpanic, just making it mostly invisible.
> 
> Paolo

Ugh.

And now that Paolo pointed out that nothing terrible
happens even when migrating from host with pvpanic
enabled to host with pvpanic disabled, I'm inclined
to think we should just disable pvpanic in 1.5.X.

Thoughts?

-- 
MST

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-22  9:37         ` Michael S. Tsirkin
@ 2013-08-22  9:52           ` Paolo Bonzini
  0 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-22  9:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-devel, qemu-stable,
	rhod, kraxel, anthony, lcapitulino, Laszlo Ersek, afaerber

Il 22/08/2013 11:37, Michael S. Tsirkin ha scritto:
>> No, you're not misunderstanding the discussion.
>>
>> Depending on the priorities of the pvpanic and legacy-DMA regions, it
>> would break DMA channel 0.  Channel 0 is (was) used for DRAM refresh, so
>> it should not have any visible effect.  However, it may not be entirely
>> disabling pvpanic, just making it mostly invisible.
>>
>> Paolo
> 
> Ugh.
> 
> And now that Paolo pointed out that nothing terrible
> happens even when migrating from host with pvpanic
> enabled to host with pvpanic disabled, I'm inclined
> to think we should just disable pvpanic in 1.5.X.
> 
> Thoughts?

You'd obviously have no objection from me.

Paolo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-22  9:19       ` Paolo Bonzini
  2013-08-22  9:37         ` Michael S. Tsirkin
@ 2013-08-22 10:34         ` Laszlo Ersek
  2013-08-22 10:36           ` Laszlo Ersek
  2013-08-22 11:35           ` Paolo Bonzini
  1 sibling, 2 replies; 22+ messages in thread
From: Laszlo Ersek @ 2013-08-22 10:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, mst,
	qemu-devel, lcapitulino, rhod, kraxel, anthony, afaerber

On 08/22/13 11:19, Paolo Bonzini wrote:
> Il 22/08/2013 10:38, Laszlo Ersek ha scritto:
>>>> To support 1.5, libvirt should simply be ready to react to unanticipated
>>>> GUEST_PANICKED events.  reboot-on-panic will simply be broken for 1.5
>>>> and Linux 3.10+ guests. :(
>> I'm probably misunderstanding the discussion, but it might be possible
>> to disable pvpanic even in 1.5 from the host side, with the following hack:
>>
>>   -global pvpanic.ioport=0
>>
>> In qemu, this will either configure a working pvpanic device on ioport
>> 0, or the pvpanic device will be genuinely broken. At least it doesn't
>> (obviously) break other stuff (in v1.5.2):
>>
>> (qemu) info mtree
>> I/O
>> 0000000000000000-000000000000ffff (prio 0, RW): io
>>   0000000000000000-0000000000000000 (prio 0, RW): pvpanic
>>   0000000000000000-0000000000000007 (prio 0, RW): dma-chan
> 
> No, you're not misunderstanding the discussion.
> 
> Depending on the priorities of the pvpanic and legacy-DMA regions, it
> would break DMA channel 0. 

<academic>

I think before priority comes into the picture, the access size would
matter first, no?

(I think I'm recalling this from the 0xCF9 reset control register, which
falls into the [0xCF8..0xCFA] range.)

Unless ioport 0 is accessed with width 1 for dma-chan purposes, I think
such an access would be unique to pvpanic, and always dispatched to pvpanic.

> Channel 0 is (was) used for DRAM refresh, so
> it should not have any visible effect.  However, it may not be entirely
> disabling pvpanic, just making it mostly invisible.

That's good enough for the guest to reach kexec :)

</academic>

Laszlo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-22 10:34         ` Laszlo Ersek
@ 2013-08-22 10:36           ` Laszlo Ersek
  2013-08-22 11:35           ` Paolo Bonzini
  1 sibling, 0 replies; 22+ messages in thread
From: Laszlo Ersek @ 2013-08-22 10:36 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, mst,
	qemu-devel, lcapitulino, rhod, kraxel, anthony, afaerber

On 08/22/13 12:34, Laszlo Ersek wrote:

> (I think I'm recalling this from the 0xCF9 reset control register, which
> falls into the [0xCF8..0xCFA] range.)

[0xCF8..0xCFB], sigh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] vl: allow "cont" from panicked state
  2013-08-22 10:34         ` Laszlo Ersek
  2013-08-22 10:36           ` Laszlo Ersek
@ 2013-08-22 11:35           ` Paolo Bonzini
  1 sibling, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2013-08-22 11:35 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: pkrempa, marcel.a, libvir-list, hutao, qemu-stable, mst,
	qemu-devel, lcapitulino, rhod, kraxel, anthony, afaerber

Il 22/08/2013 12:34, Laszlo Ersek ha scritto:
> <academic>

Actually it's fine to clarify these things!  Hence the longish
digression below.

> I think before priority comes into the picture, the access size would
> matter first, no?
> 
> (I think I'm recalling this from the 0xCF9 reset control register, which
> falls into the [0xCF8..0xCFA] range.)

The base address is what matters.  A 2- or 4-byte access to x will
always go to the region that includes address x, even if there are other
regions between x and respectively x+1 or x+3.  So an access to 0xCF8
will go to the PCI address register, while an access to 0xCF9 will
always go to the reset control register.

This happens in address_space_translate_internal:

    diff = int128_sub(section->mr->size, int128_make64(addr));

For a write to 0xCF8, addr would be 0 (it is relative to the base of the
MemoryRegion).  section->size would be 1 because the next section starts
at 0xCF9.  However, section->mr->size would be 4 as specified e.g. in
i440fx_pcihost_initfn:

    memory_region_init_io(&s->conf_mem, obj, &pci_host_conf_le_ops, s,
                          "pci-conf-idx", 4);

Using section->size would be wrong---it would attempt a 1-byte write to
0xCF8, another 1-byte write to 0xCF9, and a 2-byte write to 0xCFA.
section->mr->size instead does a single write to 0xCF8, the same as on
real hardware.

BTW, the behavior changed slightly in QEMU 1.6 for 8-byte accesses. All
such accesses were split to two 4-byte accesses before; now the maximum
size of a "direct" MMIO operation (the data bus size, effectively) is 64
bits, so a 64-bit write will always address a single MemoryRegion.

For example, say you had the PCI address and data registers occupying
two separate 4-byte MemoryRegions in 8 consecutive bytes of memory.  In
1.5 you could write both of them with a single 64-bit write.  In 1.6,
this would only write four bytes to the first MemoryRegion.  This
matches hardware more closely, and is really unlikely to be a problem: a
target with 32-bit data bus probably would not have 64-bit CPU registers
to begin with.  If it did, it would resemble the architecture of the
80386SX or 8088 processors.

> Unless ioport 0 is accessed with width 1 for dma-chan purposes, I think
> such an access would be unique to pvpanic, and always dispatched to pvpanic.

It is:

static const MemoryRegionOps channel_io_ops = {
    .read = read_chan,
    .write = write_chan,
    .endianness = DEVICE_NATIVE_ENDIAN,
    .impl = {
        .min_access_size = 1,
        .max_access_size = 1,
    },
};

>> Channel 0 is (was) used for DRAM refresh, so
>> it should not have any visible effect.  However, it may not be entirely
>> disabling pvpanic, just making it mostly invisible.
> 
> That's good enough for the guest to reach kexec :)

Yes, I cannot deny that. :)

Paolo

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2013-08-22 11:36 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-21 12:01 [Qemu-devel] [PATCH] vl: allow "cont" from panicked state Paolo Bonzini
2013-08-21 12:42 ` Laszlo Ersek
2013-08-21 12:43   ` Paolo Bonzini
2013-08-21 14:17     ` Luiz Capitulino
2013-08-21 14:30       ` Michael S. Tsirkin
2013-08-21 14:37         ` Paolo Bonzini
2013-08-21 14:58           ` Michael S. Tsirkin
2013-08-21 15:07             ` Paolo Bonzini
2013-08-21 13:32   ` Michael S. Tsirkin
2013-08-21 13:30 ` Michael S. Tsirkin
2013-08-21 13:46   ` Paolo Bonzini
2013-08-21 14:11 ` Luiz Capitulino
2013-08-21 15:23 ` Eric Blake
2013-08-21 15:32   ` Paolo Bonzini
2013-08-21 15:44     ` Michael S. Tsirkin
2013-08-22  8:38     ` Laszlo Ersek
2013-08-22  9:19       ` Paolo Bonzini
2013-08-22  9:37         ` Michael S. Tsirkin
2013-08-22  9:52           ` Paolo Bonzini
2013-08-22 10:34         ` Laszlo Ersek
2013-08-22 10:36           ` Laszlo Ersek
2013-08-22 11:35           ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).