qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: Markus Armbruster <armbru@redhat.com>
Subject: [Qemu-devel] [PULL 02/30] cpus: Fix event order on resume of stopped guest
Date: Wed,  9 May 2018 00:14:19 +0200	[thread overview]
Message-ID: <1525817687-34620-3-git-send-email-pbonzini@redhat.com> (raw)
In-Reply-To: <1525817687-34620-1-git-send-email-pbonzini@redhat.com>

From: Markus Armbruster <armbru@redhat.com>

When resume of a stopped guest immediately runs into block device
errors, the BLOCK_IO_ERROR event is sent before the RESUME event.

Reproducer:

1. Create a scratch image
   $ dd if=/dev/zero of=scratch.img bs=1M count=100

   Size doesn't actually matter.

2. Prepare blkdebug configuration:

   $ cat >blkdebug.conf <<EOF
   [inject-error]
   event = "write_aio"
   errno = "5"
   EOF

   Note that errno 5 is EIO.

3. Run a guest with an additional scratch disk, i.e. with additional
   arguments
   -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
   -device virtio-blk-pci,id=scratch,drive=scratch-drive

   The blkdebug part makes all writes to the scratch drive fail with
   EIO.  The werror=stop pauses the guest on write errors.

4. Connect to the QMP socket e.g. like this:
   $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '

   Issue QMP command 'qmp_capabilities':
   QMP> { "execute": "qmp_capabilities" }

5. Boot the guest.

6. In the guest, write to the scratch disk, e.g. like this:

   # dd if=/dev/zero of=/dev/vdb count=1

   Do double-check the device specified with of= is actually the
   scratch device!

7. Issue QMP command 'cont':
   QMP> { "execute": "cont" }

After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.

After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.

The funny event order confuses libvirt: virsh -r domstate DOMAIN
--reason reports "paused (unknown)" rather than "paused (I/O error)".

The culprit is vm_prepare_start().

    /* Ensure that a STOP/RESUME pair of events is emitted if a
     * vmstop request was pending.  The BLOCK_IO_ERROR event, for
     * example, according to documentation is always followed by
     * the STOP event.
     */
    if (runstate_is_running()) {
        qapi_event_send_stop(&error_abort);
        res = -1;
    } else {
        replay_enable_events();
        cpu_enable_ticks();
        runstate_set(RUN_STATE_RUNNING);
        vm_state_notify(1, RUN_STATE_RUNNING);
    }

    /* We are sending this now, but the CPUs will be resumed shortly later */
    qapi_event_send_resume(&error_abort);
    return res;

When resuming a stopped guest, we take the else branch before we get
to sending RESUME.  vm_state_notify() runs virtio_vmstate_change(),
among other things.  This restarts I/O, triggering the BLOCK_IO_ERROR
event.

Reshuffle vm_prepare_start() to send the RESUME event earlier.

Fixes RHBZ 1566153.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180423084518.2426-1-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/cpus.c b/cpus.c
index 5bcd3ec..be3a4eb 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2043,7 +2043,6 @@ int vm_stop(RunState state)
 int vm_prepare_start(void)
 {
     RunState requested;
-    int res = 0;
 
     qemu_vmstop_requested(&requested);
     if (runstate_is_running() && requested == RUN_STATE__MAX) {
@@ -2057,17 +2056,18 @@ int vm_prepare_start(void)
      */
     if (runstate_is_running()) {
         qapi_event_send_stop(&error_abort);
-        res = -1;
-    } else {
-        replay_enable_events();
-        cpu_enable_ticks();
-        runstate_set(RUN_STATE_RUNNING);
-        vm_state_notify(1, RUN_STATE_RUNNING);
+        qapi_event_send_resume(&error_abort);
+        return -1;
     }
 
     /* We are sending this now, but the CPUs will be resumed shortly later */
     qapi_event_send_resume(&error_abort);
-    return res;
+
+    replay_enable_events();
+    cpu_enable_ticks();
+    runstate_set(RUN_STATE_RUNNING);
+    vm_state_notify(1, RUN_STATE_RUNNING);
+    return 0;
 }
 
 void vm_start(void)
-- 
1.8.3.1

  parent reply	other threads:[~2018-05-08 22:14 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-08 22:14 [Qemu-devel] [PULL 00/30] Misc patches for 2018-05-09 Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 01/30] configure: recognize more rpmbuild macros Paolo Bonzini
2018-05-08 22:14 ` Paolo Bonzini [this message]
2018-05-08 22:14 ` [Qemu-devel] [PULL 03/30] cpus: tcg: fix never exiting loop on unplug Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 04/30] checkpatch.pl: add common glib defines to typelist Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 05/30] qom: allow object_get_canonical_path_component without parent Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 06/30] memdev: remove "id" property Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 07/30] exec: move memory access declarations to a common header, inline *_phys functions Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 08/30] exec: small changes to flatview_do_translate Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 09/30] exec: extract address_space_translate_iommu, fix page_mask corner case Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 10/30] exec: reintroduce MemoryRegion caching Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 11/30] qemu-thread: always keep the posix wrapper layer Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 12/30] update-linux-headers: drop hyperv.h Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 13/30] accel: use g_strsplit for parsing accelerator names Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 14/30] opts: don't silently truncate long parameter keys Paolo Bonzini
2018-05-09  5:46   ` Thomas Huth
2018-05-08 22:14 ` [Qemu-devel] [PULL 15/30] opts: don't silently truncate long option values Paolo Bonzini
2018-05-14 16:19   ` Peter Maydell
2018-05-14 16:23     ` Daniel P. Berrangé
2018-05-08 22:14 ` [Qemu-devel] [PULL 16/30] target/i386: sev: fix memory leaks Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 17/30] qemu-options: Mark -virtioconsole as deprecated Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 18/30] qemu-options: Remove remainders of the -tdf option Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 19/30] qemu-options: Bail out on unsupported options instead of silently ignoring them Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 20/30] qemu-options: Remove deprecated -no-kvm-pit-reinjection Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 21/30] qemu-options: Remove deprecated -no-kvm-irqchip Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 22/30] qemu-doc: provide details of supported build platforms Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 23/30] glib: bump min required glib library version to 2.42 Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 24/30] i386/kvm: add support for Hyper-V reenlightenment MSRs Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 25/30] configure: Really use local libfdt if the system one is too old Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 26/30] configure: Display if libfdt is from system or git Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 27/30] shippable: Remove Debian 8 libfdt kludge Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 28/30] build: Silence dtc directory creation Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 29/30] pc-dimm: fix error messages if no slots were defined Paolo Bonzini
2018-05-08 22:14 ` [Qemu-devel] [PULL 30/30] rename included C files to foo.inc.c, remove osdep.h Paolo Bonzini
2018-05-11 12:19 ` [Qemu-devel] [PULL 00/30] Misc patches for 2018-05-09 Peter Maydell
2018-05-11 12:33   ` Paolo Bonzini
2018-05-11 12:39     ` Peter Maydell
2018-05-11 12:42   ` Daniel P. Berrangé
2018-05-11 12:50     ` Peter Maydell
2018-05-11 12:54       ` Daniel P. Berrangé

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1525817687-34620-3-git-send-email-pbonzini@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=armbru@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).