qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors
@ 2014-06-03 14:16 Paolo Bonzini
  2014-06-03 14:37 ` Kevin Wolf
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2014-06-03 14:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, stefanha

With virtio-blk dataplane, I/O errors might occur while QEMU is
not in the main I/O thread.  However, it's invalid to call vm_stop
when we're neither in a VCPU thread nor in the main I/O thread,
even if we were to take the iothread mutex around it.

To avoid this problem, simply raise a request to the main I/O thread,
similar to what QEMU does when vm_stop is called from a CPU thread.
We know that bdrv_error_action is called from an AIO callback, and
the moment at which the callback will fire is not well-defined; it
depends on the moment at which the disk or OS finishes the operation,
which can happen at any time.

Note that QEMU is certainly not in a CPU thread and we do not need to
call cpu_stop_current() like vm_stop() does.

This makes bdrv_error_action() thread safe.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c         | 2 +-
 stubs/vm-stop.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index fc2edd3..fa41598 100644
--- a/block.c
+++ b/block.c
@@ -3515,7 +3515,7 @@ void bdrv_error_action(BlockDriverState *bs, BlockErrorAction action,
     assert(error >= 0);
     bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
     if (action == BDRV_ACTION_STOP) {
-        vm_stop(RUN_STATE_IO_ERROR);
+        qemu_system_vmstop_request(RUN_STATE_IO_ERROR);
         bdrv_iostatus_set_err(bs, error);
     }
 }
diff --git a/stubs/vm-stop.c b/stubs/vm-stop.c
index f82c897..7fbeefd 100644
--- a/stubs/vm-stop.c
+++ b/stubs/vm-stop.c
@@ -1,7 +1,7 @@
 #include "qemu-common.h"
 #include "sysemu/sysemu.h"
 
-int vm_stop(RunState state)
+void qemu_system_vmstop_request(RunState state)
 {
     abort();
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors
  2014-06-03 14:16 [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors Paolo Bonzini
@ 2014-06-03 14:37 ` Kevin Wolf
  2014-06-03 15:51   ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Wolf @ 2014-06-03 14:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: famz, qemu-devel, stefanha

Am 03.06.2014 um 16:16 hat Paolo Bonzini geschrieben:
> With virtio-blk dataplane, I/O errors might occur while QEMU is
> not in the main I/O thread.  However, it's invalid to call vm_stop
> when we're neither in a VCPU thread nor in the main I/O thread,
> even if we were to take the iothread mutex around it.
> 
> To avoid this problem, simply raise a request to the main I/O thread,
> similar to what QEMU does when vm_stop is called from a CPU thread.
> We know that bdrv_error_action is called from an AIO callback, and
> the moment at which the callback will fire is not well-defined; it
> depends on the moment at which the disk or OS finishes the operation,
> which can happen at any time.
> 
> Note that QEMU is certainly not in a CPU thread and we do not need to
> call cpu_stop_current() like vm_stop() does.

Do I understand correctly that this is not a fundamental truth of qemu's
operation, but holds true only because the drivers that do support
rerror/werror all use bdrv_aio_readv/writev(), which guarantees that a
BH is used in error cases? Otherwise I think an I/O handler in a vcpu
thread could directly call into the block layer and fail immediately
(might happen for example if we added rerror/werror support to ATAPI).

> This makes bdrv_error_action() thread safe.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block.c         | 2 +-
>  stubs/vm-stop.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index fc2edd3..fa41598 100644
> --- a/block.c
> +++ b/block.c
> @@ -3515,7 +3515,7 @@ void bdrv_error_action(BlockDriverState *bs, BlockErrorAction action,
>      assert(error >= 0);
>      bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
>      if (action == BDRV_ACTION_STOP) {
> -        vm_stop(RUN_STATE_IO_ERROR);
> +        qemu_system_vmstop_request(RUN_STATE_IO_ERROR);
>          bdrv_iostatus_set_err(bs, error);

By delaying the actual state change, does this break the invariant that
bs->iostatus is BLOCK_DEVICE_IO_STATUS_OK while the VM is running?

I know this invariant was mentioned occasionally. Not sure if anything
actually breaks when it's violated, though.

Kevin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors
  2014-06-03 14:37 ` Kevin Wolf
@ 2014-06-03 15:51   ` Paolo Bonzini
  2014-06-04  8:28     ` Kevin Wolf
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2014-06-03 15:51 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: famz, qemu-devel, stefanha

Il 03/06/2014 16:37, Kevin Wolf ha scritto:
> Am 03.06.2014 um 16:16 hat Paolo Bonzini geschrieben:
>> With virtio-blk dataplane, I/O errors might occur while QEMU is
>> not in the main I/O thread.  However, it's invalid to call vm_stop
>> when we're neither in a VCPU thread nor in the main I/O thread,
>> even if we were to take the iothread mutex around it.
>>
>> To avoid this problem, simply raise a request to the main I/O thread,
>> similar to what QEMU does when vm_stop is called from a CPU thread.
>> We know that bdrv_error_action is called from an AIO callback, and
>> the moment at which the callback will fire is not well-defined; it
>> depends on the moment at which the disk or OS finishes the operation,
>> which can happen at any time.
>>
>> Note that QEMU is certainly not in a CPU thread and we do not need to
>> call cpu_stop_current() like vm_stop() does.
> 
> Do I understand correctly that this is not a fundamental truth of qemu's
> operation, but holds true only because the drivers that do support
> rerror/werror all use bdrv_aio_readv/writev(), which guarantees that a
> BH is used in error cases? Otherwise I think an I/O handler in a vcpu
> thread could directly call into the block layer and fail immediately
> (might happen for example if we added rerror/werror support to ATAPI).
> 
> By delaying the actual state change, does this break the invariant that
> bs->iostatus is BLOCK_DEVICE_IO_STATUS_OK while the VM is running?

These two comments are actually related, in that the invariant was 
already not respected if an I/O handler in a VCPU thread could fail 
immediately.

Breaking this invariant means that you have a very small window where 
{'execute':'cont'} would actually not restart the VM.  I think this 
should be fixed by dropping the request in vm_start, like this:

diff --git a/vl.c b/vl.c
index db9ea90..09af28a 100644
--- a/vl.c
+++ b/vl.c
@@ -1721,8 +1721,31 @@ void vm_state_notify(int running, RunState state)
     }
 }
 
+static RunState vmstop_requested = RUN_STATE_MAX;
+
+/* We use RUN_STATE_MAX but any invalid value will do */
+static bool qemu_vmstop_requested(RunState *r)
+{
+    if (vmstop_requested < RUN_STATE_MAX) {
+        *r = vmstop_requested;
+        vmstop_requested = RUN_STATE_MAX;
+        return true;
+    }
+
+    return false;
+}
+
+void qemu_system_vmstop_request(RunState state)
+{
+    vmstop_requested = state;
+    qemu_notify_event();
+}
+
 void vm_start(void)
 {
+    RunState dummy;
+
+    qemu_vmstop_requested(&dummy);
     if (!runstate_is_running()) {
         cpu_enable_ticks();
         runstate_set(RUN_STATE_RUNNING);
@@ -1756,7 +1779,6 @@ static NotifierList suspend_notifiers =
 static NotifierList wakeup_notifiers =
     NOTIFIER_LIST_INITIALIZER(wakeup_notifiers);
 static uint32_t wakeup_reason_mask = ~(1 << QEMU_WAKEUP_REASON_NONE);
-static RunState vmstop_requested = RUN_STATE_MAX;
 
 int qemu_shutdown_requested_get(void)
 {
@@ -1824,18 +1846,6 @@ static int qemu_debug_requested(void)
     return r;
 }
 
-/* We use RUN_STATE_MAX but any invalid value will do */
-static bool qemu_vmstop_requested(RunState *r)
-{
-    if (vmstop_requested < RUN_STATE_MAX) {
-        *r = vmstop_requested;
-        vmstop_requested = RUN_STATE_MAX;
-        return true;
-    }
-
-    return false;
-}
-
 void qemu_register_reset(QEMUResetHandler *func, void *opaque)
 {
     QEMUResetEntry *re = g_malloc0(sizeof(QEMUResetEntry));
@@ -1985,12 +1995,6 @@ void qemu_system_debug_request(void)
     qemu_notify_event();
 }
 
-void qemu_system_vmstop_request(RunState state)
-{
-    vmstop_requested = state;
-    qemu_notify_event();
-}
-
 static bool main_loop_should_exit(void)
 {
     RunState r;


Also, I think that bdrv_emit_qmp_error_event is placed wrong.
It should be called only after setting the iostatus, otherwise
there is a small window where the iostatus is "no error" but
the event has been generated already.

Paolo

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors
  2014-06-03 15:51   ` Paolo Bonzini
@ 2014-06-04  8:28     ` Kevin Wolf
  2014-06-04  9:04       ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Wolf @ 2014-06-04  8:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: famz, qemu-devel, stefanha

Am 03.06.2014 um 17:51 hat Paolo Bonzini geschrieben:
> Il 03/06/2014 16:37, Kevin Wolf ha scritto:
> > Am 03.06.2014 um 16:16 hat Paolo Bonzini geschrieben:
> >> With virtio-blk dataplane, I/O errors might occur while QEMU is
> >> not in the main I/O thread.  However, it's invalid to call vm_stop
> >> when we're neither in a VCPU thread nor in the main I/O thread,
> >> even if we were to take the iothread mutex around it.
> >>
> >> To avoid this problem, simply raise a request to the main I/O thread,
> >> similar to what QEMU does when vm_stop is called from a CPU thread.
> >> We know that bdrv_error_action is called from an AIO callback, and
> >> the moment at which the callback will fire is not well-defined; it
> >> depends on the moment at which the disk or OS finishes the operation,
> >> which can happen at any time.
> >>
> >> Note that QEMU is certainly not in a CPU thread and we do not need to
> >> call cpu_stop_current() like vm_stop() does.
> > 
> > Do I understand correctly that this is not a fundamental truth of qemu's
> > operation, but holds true only because the drivers that do support
> > rerror/werror all use bdrv_aio_readv/writev(), which guarantees that a
> > BH is used in error cases? Otherwise I think an I/O handler in a vcpu
> > thread could directly call into the block layer and fail immediately
> > (might happen for example if we added rerror/werror support to ATAPI).
> > 
> > By delaying the actual state change, does this break the invariant that
> > bs->iostatus is BLOCK_DEVICE_IO_STATUS_OK while the VM is running?
> 
> These two comments are actually related, in that the invariant was 
> already not respected if an I/O handler in a VCPU thread could fail 
> immediately.

Oh, right, I somehow expected that vm_stop() waits for the CPU to be
stopped before it returns, but that's not what it does.

> Breaking this invariant means that you have a very small window where 
> {'execute':'cont'} would actually not restart the VM.  I think this 
> should be fixed by dropping the request in vm_start, like this:
> [...]

Sounds like an option. Do we need to send a QEVENT_STOP/QEVENT_RESUME
pair? If we don't, the client will still notice a difference to a real
stop and resume.

> Also, I think that bdrv_emit_qmp_error_event is placed wrong.
> It should be called only after setting the iostatus, otherwise
> there is a small window where the iostatus is "no error" but
> the event has been generated already.

Yes, I agree.

The documentation for this event actually answers my above question:

    Note: If action is "stop", a STOP event will eventually follow the
    BLOCK_IO_ERROR event.

Perhaps we should also change the documentation of the "stop" value to
clarify that the VM may not actually be stopped yet. It currently reads
like this:

    "stop": error caused VM to be stopped

Kevin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors
  2014-06-04  8:28     ` Kevin Wolf
@ 2014-06-04  9:04       ` Paolo Bonzini
  0 siblings, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2014-06-04  9:04 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: famz, qemu-devel, stefanha

Il 04/06/2014 10:28, Kevin Wolf ha scritto:
>> Breaking this invariant means that you have a very small window where
>> {'execute':'cont'} would actually not restart the VM.  I think this
>> should be fixed by dropping the request in vm_start, like this:
>> [...]
>
> Sounds like an option. Do we need to send a QEVENT_STOP/QEVENT_RESUME
> pair? If we don't, the client will still notice a difference to a real
> stop and resume.

Yes, better do that.

>> Also, I think that bdrv_emit_qmp_error_event is placed wrong.
>> It should be called only after setting the iostatus, otherwise
>> there is a small window where the iostatus is "no error" but
>> the event has been generated already.
>
> Yes, I agree.
>
> The documentation for this event actually answers my above question:
>
>     Note: If action is "stop", a STOP event will eventually follow the
>     BLOCK_IO_ERROR event.
>
> Perhaps we should also change the documentation of the "stop" value to
> clarify that the VM may not actually be stopped yet. It currently reads
> like this:
>
>     "stop": error caused VM to be stopped

Yes.

Paolo

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-06-04  9:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-03 14:16 [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors Paolo Bonzini
2014-06-03 14:37 ` Kevin Wolf
2014-06-03 15:51   ` Paolo Bonzini
2014-06-04  8:28     ` Kevin Wolf
2014-06-04  9:04       ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).