From: Paolo Bonzini
Date: Thu, 5 Jun 2014 14:53:59 +0200
Message-Id: <1401972839-25213-3-git-send-email-pbonzini@redhat.com>
In-Reply-To: <1401972839-25213-1-git-send-email-pbonzini@redhat.com>
References: <1401972839-25213-1-git-send-email-pbonzini@redhat.com>
Subject: [Qemu-devel] [PATCH v2 2/2] block: asynchronously stop the VM on I/O errors
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, famz@redhat.com, stefanha@redhat.com

With virtio-blk dataplane, I/O errors might occur while QEMU is
not in the main I/O thread. However, it's invalid to call vm_stop
when we're neither in a VCPU thread nor in the main I/O thread,
even if we were to take the iothread mutex around it.

To avoid this problem, we can raise a request to the main I/O thread,
similar to what QEMU does when vm_stop is called from a CPU thread.
We know that bdrv_error_action is called from an AIO callback, and
the moment at which the callback will fire is not well-defined; it
depends on the moment at which the disk or OS finishes the operation,
which can happen at any time. Note that QEMU is certainly not in a CPU
thread and we do not need to call cpu_stop_current() like vm_stop()
does.

However, we need to ensure that any action taken by management will
result in correct detection of the error _and_ a running VM. In
particular:

- the event must be raised after the iostatus has been set, so that
"info block" will return an iostatus that matches the event.

- the VM must be stopped after the iostatus has been set, so that
"info block" will return an iostatus that matches the runstate.

The ordering between the STOP and BLOCK_IO_ERROR events is preserved;
BLOCK_IO_ERROR is documented to come first.

This makes bdrv_error_action() thread safe (assuming QMP events are,
which is attacked by a separate series).

Signed-off-by: Paolo Bonzini
---
 block.c                 | 20 ++++++++++++++++++--
 docs/qmp/qmp-events.txt |  2 +-
 stubs/vm-stop.c         |  7 ++++++-
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 17f763d..32082f6 100644
--- a/block.c
+++ b/block.c
@@ -3629,10 +3629,27 @@ void bdrv_error_action(BlockDriverState *bs, BlockErrorAction action,
                        bool is_read, int error)
 {
     assert(error >= 0);
-    bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
+
     if (action == BDRV_ACTION_STOP) {
-        vm_stop(RUN_STATE_IO_ERROR);
+        /* First set the iostatus, so that "info block" returns an iostatus
+         * that matches the events raised so far (an additional error iostatus
+         * is fine, but not a lost one).
+         */
         bdrv_iostatus_set_err(bs, error);
+
+        /* Then raise the request to stop the VM and the event.
+         * qemu_system_vmstop_request_prepare has two effects. First,
+         * it ensures that the STOP event always comes after the
+         * BLOCK_IO_ERROR event. Second, it ensures that even if management
+         * can observe the STOP event and do a "cont" before the STOP
+         * event is issued, the VM will not stop. In this case, vm_start()
+         * also ensures that the STOP/RESUME pair of events is emitted.
+         */
+        qemu_system_vmstop_request_prepare();
+        bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
+        qemu_system_vmstop_request(RUN_STATE_IO_ERROR);
+    } else {
+        bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
     }
 }

diff --git a/docs/qmp/qmp-events.txt b/docs/qmp/qmp-events.txt
index 145402e..849ec9d 100644
--- a/docs/qmp/qmp-events.txt
+++ b/docs/qmp/qmp-events.txt
@@ -52,7 +52,7 @@ Data:
 - "action": action that has been taken, it's one of the following (json-string):
     "ignore": error has been ignored
     "report": error has been reported to the device
-    "stop": error caused VM to be stopped
+    "stop": the VM is going to stop because of the error

 Example:

diff --git a/stubs/vm-stop.c b/stubs/vm-stop.c
index f82c897..69fd86b 100644
--- a/stubs/vm-stop.c
+++ b/stubs/vm-stop.c
@@ -1,7 +1,12 @@
 #include "qemu-common.h"
 #include "sysemu/sysemu.h"

-int vm_stop(RunState state)
+void qemu_system_vmstop_request_prepare(void)
+{
+    abort();
+}
+
+void qemu_system_vmstop_request(RunState state)
 {
     abort();
 }
-- 
1.8.3.1
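
For readers following the ordering argument in the commit message, here is a
minimal standalone sketch of the "raise a request, let the main loop stop the
VM" idea. It is not QEMU code: every model_* name, the struct and the enums are
hypothetical, and the prepare step (qemu_system_vmstop_request_prepare) is
deliberately left out, so the sketch only shows iostatus first, then the
BLOCK_IO_ERROR event, then the stop request, with STOP emitted later by the
main loop. It is a single-threaded walk-through; the atomic slot only marks
where the cross-thread hand-off would happen.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are QEMU's real definitions. */
enum model_runstate { MODEL_NO_REQUEST, MODEL_RUNNING, MODEL_IO_ERROR };
enum model_event { MODEL_EV_BLOCK_IO_ERROR, MODEL_EV_STOP };

#define MODEL_MAX_EVENTS 8

struct model_vm {
    enum model_runstate runstate;  /* changed only by the main loop */
    bool iostatus_err;             /* what "info block" would report */
    atomic_int stop_request;       /* hand-off slot: worker -> main loop */
    enum model_event events[MODEL_MAX_EVENTS]; /* events in emission order */
    size_t n_events;
};

static void model_init(struct model_vm *vm)
{
    vm->runstate = MODEL_RUNNING;
    vm->iostatus_err = false;
    atomic_init(&vm->stop_request, MODEL_NO_REQUEST);
    vm->n_events = 0;
}

static void model_emit(struct model_vm *vm, enum model_event ev)
{
    assert(vm->n_events < MODEL_MAX_EVENTS);
    vm->events[vm->n_events++] = ev;
}

/* Worker side: same order as the BDRV_ACTION_STOP branch, minus prepare. */
static void model_error_action_stop(struct model_vm *vm)
{
    vm->iostatus_err = true;                         /* 1. iostatus first */
    model_emit(vm, MODEL_EV_BLOCK_IO_ERROR);         /* 2. then the event */
    atomic_store(&vm->stop_request, MODEL_IO_ERROR); /* 3. then the request */
}

/* Main-loop side: consume a pending request, if any, and stop the VM.
 * Returns true when this iteration actually stopped the VM. */
static bool model_main_loop_iteration(struct model_vm *vm)
{
    int req = atomic_exchange(&vm->stop_request, MODEL_NO_REQUEST);

    if (req == MODEL_NO_REQUEST || vm->runstate != MODEL_RUNNING) {
        return false;
    }
    vm->runstate = (enum model_runstate)req;
    model_emit(vm, MODEL_EV_STOP);
    return true;
}
```

In this model, calling model_error_action_stop() leaves the runstate untouched
until the next model_main_loop_iteration(), which is the asynchronous part; by
then the iostatus is already set, so a query made at any later point sees an
iostatus consistent with both the event and the runstate.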