From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: Hyman Huang <huangy81@chinatelecom.cn>, Peter Xu <peterx@redhat.com>
Subject: [PULL 02/30] memory: Fix qemu crash on starting dirty log twice with stopped VM
Date: Tue, 15 Feb 2022 10:31:55 +0100	[thread overview]
Message-ID: <20220215093223.110827-3-pbonzini@redhat.com> (raw)
In-Reply-To: <20220215093223.110827-1-pbonzini@redhat.com>
From: Peter Xu <peterx@redhat.com>
QEMU can now easily crash with two continuous migration carried out:
(qemu) migrate -d exec:cat>out
(qemu) migrate_cancel
(qemu) migrate -d exec:cat>out
[crash] ../softmmu/memory.c:2782: memory_global_dirty_log_start: Assertion
`!(global_dirty_tracking & flags)' failed.
It's because memory API provides a way to postpone dirty log stop if the VM is
stopped, and that'll be re-done until the next VM start.  It was added in 2017
with commit 1931076077 ("migration: optimize the downtime", 2017-08-01).
However the recent work on allowing dirty tracking to be bitmask broke it,
which is commit 63b41db4bc ("memory: make global_dirty_tracking a bitmask",
2021-11-01).
The fix proposed in this patch contains two things:
  (1) Instead of passing over the flags to postpone stop dirty track, we add a
      global variable (along with current vmstate_change variable) to record
      what flags to stop dirty tracking.
  (2) When start dirty tracking, instead if remove the vmstate hook directly,
      we also execute the postponed stop process so that we make sure all the
      starts and stops will be paired.
This procedure is overlooked in the bitmask-ify work in 2021.
Cc: Hyman Huang <huangy81@chinatelecom.cn>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2044818
Fixes: 63b41db4bc ("memory: make global_dirty_tracking a bitmask")
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220207123019.27223-1-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 softmmu/memory.c | 65 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 47 insertions(+), 18 deletions(-)
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 678dc62f06..8060c6de78 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -2790,19 +2790,32 @@ void memory_global_after_dirty_log_sync(void)
     MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
 }
 
+/*
+ * Dirty track stop flags that are postponed due to VM being stopped.  Should
+ * only be used within vmstate_change hook.
+ */
+static unsigned int postponed_stop_flags;
 static VMChangeStateEntry *vmstate_change;
+static void memory_global_dirty_log_stop_postponed_run(void);
 
 void memory_global_dirty_log_start(unsigned int flags)
 {
-    unsigned int old_flags = global_dirty_tracking;
-
-    if (vmstate_change) {
-        qemu_del_vm_change_state_handler(vmstate_change);
-        vmstate_change = NULL;
-    }
+    unsigned int old_flags;
 
     assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
-    assert(!(global_dirty_tracking & flags));
+
+    if (vmstate_change) {
+        /* If there is postponed stop(), operate on it first */
+        postponed_stop_flags &= ~flags;
+        memory_global_dirty_log_stop_postponed_run();
+    }
+
+    flags &= ~global_dirty_tracking;
+    if (!flags) {
+        return;
+    }
+
+    old_flags = global_dirty_tracking;
     global_dirty_tracking |= flags;
     trace_global_dirty_changed(global_dirty_tracking);
 
@@ -2830,29 +2843,45 @@ static void memory_global_dirty_log_do_stop(unsigned int flags)
     }
 }
 
+/*
+ * Execute the postponed dirty log stop operations if there is, then reset
+ * everything (including the flags and the vmstate change hook).
+ */
+static void memory_global_dirty_log_stop_postponed_run(void)
+{
+    /* This must be called with the vmstate handler registered */
+    assert(vmstate_change);
+
+    /* Note: postponed_stop_flags can be cleared in log start routine */
+    if (postponed_stop_flags) {
+        memory_global_dirty_log_do_stop(postponed_stop_flags);
+        postponed_stop_flags = 0;
+    }
+
+    qemu_del_vm_change_state_handler(vmstate_change);
+    vmstate_change = NULL;
+}
+
 static void memory_vm_change_state_handler(void *opaque, bool running,
                                            RunState state)
 {
-    unsigned int flags = (unsigned int)(uintptr_t)opaque;
     if (running) {
-        memory_global_dirty_log_do_stop(flags);
-
-        if (vmstate_change) {
-            qemu_del_vm_change_state_handler(vmstate_change);
-            vmstate_change = NULL;
-        }
+        memory_global_dirty_log_stop_postponed_run();
     }
 }
 
 void memory_global_dirty_log_stop(unsigned int flags)
 {
     if (!runstate_is_running()) {
+        /* Postpone the dirty log stop, e.g., to when VM starts again */
         if (vmstate_change) {
-            return;
+            /* Batch with previous postponed flags */
+            postponed_stop_flags |= flags;
+        } else {
+            postponed_stop_flags = flags;
+            vmstate_change = qemu_add_vm_change_state_handler(
+                memory_vm_change_state_handler, NULL);
         }
-        vmstate_change = qemu_add_vm_change_state_handler(
-                                memory_vm_change_state_handler,
-                                (void *)(uintptr_t)flags);
         return;
     }
 
-- 
2.34.1
next prev parent reply	other threads:[~2022-02-15  9:35 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-15  9:31 [PULL 00/30] Misc mostly build system patches for 2022-02-15 Paolo Bonzini
2022-02-15  9:31 ` [PULL 01/30] target/i386: add TCG support for UMIP Paolo Bonzini
2022-02-15  9:31 ` Paolo Bonzini [this message]
2022-02-15  9:31 ` [PULL 03/30] tests/qemu-iotests/testrunner: Print diff to stderr in TAP mode Paolo Bonzini
2022-02-15  9:31 ` [PULL 04/30] meson: use .allowed() method for features Paolo Bonzini
2022-02-15  9:31 ` [PULL 05/30] meson: use .require() and .disable_auto_if() " Paolo Bonzini
2022-02-15  9:31 ` [PULL 06/30] configure, meson: move AVX tests to meson Paolo Bonzini
2022-02-15  9:32 ` [PULL 07/30] configure, meson: move membarrier test " Paolo Bonzini
2022-02-15  9:32 ` [PULL 08/30] configure, meson: move AF_ALG " Paolo Bonzini
2022-02-15  9:32 ` [PULL 09/30] configure, meson: move libnuma detection " Paolo Bonzini
2022-02-15  9:32 ` [PULL 10/30] configure, meson: move TPM check " Paolo Bonzini
2022-02-15  9:32 ` [PULL 11/30] configure, meson: cleanup qemu-ga libraries Paolo Bonzini
2022-02-15  9:32 ` [PULL 12/30] configure, meson: move image format options to meson_options.txt Paolo Bonzini
2022-02-15  9:32 ` [PULL 13/30] configure, meson: move block layer " Paolo Bonzini
2022-02-15  9:32 ` [PULL 14/30] meson: define qemu_cflags/qemu_ldflags Paolo Bonzini
2022-02-15  9:32 ` [PULL 15/30] configure, meson: move some default-disabled options to meson_options.txt Paolo Bonzini
2023-04-11  9:42   ` Peter Maydell
2022-02-15  9:32 ` [PULL 16/30] configure, meson: move coroutine " Paolo Bonzini
2022-02-15  9:32 ` [PULL 17/30] configure, meson: move smbd " Paolo Bonzini
2022-02-15  9:32 ` [PULL 18/30] configure, meson: move guest-agent, tools to meson Paolo Bonzini
2022-03-17 22:34   ` Brad Smith
2022-02-15  9:32 ` [PULL 19/30] meson: refine check for whether to look for virglrenderer Paolo Bonzini
2022-02-15  9:32 ` [PULL 20/30] configure, meson: move OpenGL check to meson Paolo Bonzini
2022-02-15  9:32 ` [PULL 21/30] qga/vss-win32: fix midl arguments Paolo Bonzini
2022-02-15  9:32 ` [PULL 22/30] meson: drop --with-win-sdk Paolo Bonzini
2022-02-15  9:32 ` [PULL 23/30] qga/vss-win32: use widl if available Paolo Bonzini
2022-02-15  9:32 ` [PULL 24/30] qga/vss: use standard windows headers location Paolo Bonzini
2022-02-15  9:32 ` [PULL 25/30] configure, meson: replace VSS SDK checks and options with --enable-vss-sdk Paolo Bonzini
2022-02-15  9:32 ` [PULL 26/30] meson: do not make qga/vss-win32/meson.build conditional on C++ presence Paolo Bonzini
2022-02-15  9:32 ` [PULL 27/30] qga/vss-win32: require widl/midl, remove pre-built TLB file Paolo Bonzini
2022-02-15  9:32 ` [PULL 28/30] meson: require dynamic linking for VSS support Paolo Bonzini
2022-02-15  9:32 ` [PULL 29/30] meson, configure: move ntddscsi API check to meson Paolo Bonzini
2022-02-15  9:32 ` [PULL 30/30] configure, meson: move CONFIG_IASL to a Meson option Paolo Bonzini
2022-02-16  9:56 ` [PULL 00/30] Misc mostly build system patches for 2022-02-15 Peter Maydell
2022-02-16 14:03   ` Paolo Bonzini
2022-02-16 14:41     ` Peter Maydell
2022-02-16 21:06       ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to=20220215093223.110827-3-pbonzini@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=huangy81@chinatelecom.cn \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank line
  before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).