* [PATCH v2] memory: Fix qemu crash on starting dirty log twice with stopped VM
@ 2022-02-07 12:30 Peter Xu
2022-02-08 11:15 ` Paolo Bonzini
0 siblings, 1 reply; 2+ messages in thread
From: Peter Xu @ 2022-02-07 12:30 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, David Hildenbrand, Hyman Huang,
Dr . David Alan Gilbert, peterx, Philippe Mathieu-Daudé,
Leonardo Bras Soares Passos, Paolo Bonzini
QEMU can now easily crash with two continuous migration carried out:
(qemu) migrate -d exec:cat>out
(qemu) migrate_cancel
(qemu) migrate -d exec:cat>out
[crash] ../softmmu/memory.c:2782: memory_global_dirty_log_start: Assertion
`!(global_dirty_tracking & flags)' failed.
It's because memory API provides a way to postpone dirty log stop if the VM is
stopped, and that'll be re-done until the next VM start. It was added in 2017
with commit 1931076077 ("migration: optimize the downtime", 2017-08-01).
However the recent work on allowing dirty tracking to be bitmask broke it,
which is commit 63b41db4bc ("memory: make global_dirty_tracking a bitmask",
2021-11-01).
The fix proposed in this patch contains two things:
(1) Instead of passing over the flags to postpone stop dirty track, we add a
global variable (along with current vmstate_change variable) to record
what flags to stop dirty tracking.
(2) When start dirty tracking, instead if remove the vmstate hook directly,
we also execute the postponed stop process so that we make sure all the
starts and stops will be paired.
This procedure is overlooked in the bitmask-ify work in 2021.
Cc: Hyman Huang <huangy81@chinatelecom.cn>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2044818
Fixes: 63b41db4bc ("memory: make global_dirty_tracking a bitmask")
Signed-off-by: Peter Xu <peterx@redhat.com>
---
softmmu/memory.c | 61 +++++++++++++++++++++++++++++++++++-------------
1 file changed, 45 insertions(+), 16 deletions(-)
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 678dc62f06..8060c6de78 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -2790,19 +2790,32 @@ void memory_global_after_dirty_log_sync(void)
MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
}
+/*
+ * Dirty track stop flags that are postponed due to VM being stopped. Should
+ * only be used within vmstate_change hook.
+ */
+static unsigned int postponed_stop_flags;
static VMChangeStateEntry *vmstate_change;
+static void memory_global_dirty_log_stop_postponed_run(void);
void memory_global_dirty_log_start(unsigned int flags)
{
- unsigned int old_flags = global_dirty_tracking;
+ unsigned int old_flags;
+
+ assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
if (vmstate_change) {
- qemu_del_vm_change_state_handler(vmstate_change);
- vmstate_change = NULL;
+ /* If there is postponed stop(), operate on it first */
+ postponed_stop_flags &= ~flags;
+ memory_global_dirty_log_stop_postponed_run();
}
- assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
- assert(!(global_dirty_tracking & flags));
+ flags &= ~global_dirty_tracking;
+ if (!flags) {
+ return;
+ }
+
+ old_flags = global_dirty_tracking;
global_dirty_tracking |= flags;
trace_global_dirty_changed(global_dirty_tracking);
@@ -2830,29 +2843,45 @@ static void memory_global_dirty_log_do_stop(unsigned int flags)
}
}
+/*
+ * Execute the postponed dirty log stop operations if there is, then reset
+ * everything (including the flags and the vmstate change hook).
+ */
+static void memory_global_dirty_log_stop_postponed_run(void)
+{
+ /* This must be called with the vmstate handler registered */
+ assert(vmstate_change);
+
+ /* Note: postponed_stop_flags can be cleared in log start routine */
+ if (postponed_stop_flags) {
+ memory_global_dirty_log_do_stop(postponed_stop_flags);
+ postponed_stop_flags = 0;
+ }
+
+ qemu_del_vm_change_state_handler(vmstate_change);
+ vmstate_change = NULL;
+}
+
static void memory_vm_change_state_handler(void *opaque, bool running,
RunState state)
{
- unsigned int flags = (unsigned int)(uintptr_t)opaque;
if (running) {
- memory_global_dirty_log_do_stop(flags);
-
- if (vmstate_change) {
- qemu_del_vm_change_state_handler(vmstate_change);
- vmstate_change = NULL;
- }
+ memory_global_dirty_log_stop_postponed_run();
}
}
void memory_global_dirty_log_stop(unsigned int flags)
{
if (!runstate_is_running()) {
+ /* Postpone the dirty log stop, e.g., to when VM starts again */
if (vmstate_change) {
- return;
+ /* Batch with previous postponed flags */
+ postponed_stop_flags |= flags;
+ } else {
+ postponed_stop_flags = flags;
+ vmstate_change = qemu_add_vm_change_state_handler(
+ memory_vm_change_state_handler, NULL);
}
- vmstate_change = qemu_add_vm_change_state_handler(
- memory_vm_change_state_handler,
- (void *)(uintptr_t)flags);
return;
}
--
2.32.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH v2] memory: Fix qemu crash on starting dirty log twice with stopped VM
2022-02-07 12:30 [PATCH v2] memory: Fix qemu crash on starting dirty log twice with stopped VM Peter Xu
@ 2022-02-08 11:15 ` Paolo Bonzini
0 siblings, 0 replies; 2+ messages in thread
From: Paolo Bonzini @ 2022-02-08 11:15 UTC (permalink / raw)
To: Peter Xu, qemu-devel
Cc: David Hildenbrand, Juan Quintela, Hyman Huang,
Philippe Mathieu-Daudé, Dr . David Alan Gilbert,
Leonardo Bras Soares Passos
On 2/7/22 13:30, Peter Xu wrote:
> QEMU can now easily crash with two continuous migration carried out:
>
> (qemu) migrate -d exec:cat>out
> (qemu) migrate_cancel
> (qemu) migrate -d exec:cat>out
> [crash] ../softmmu/memory.c:2782: memory_global_dirty_log_start: Assertion
> `!(global_dirty_tracking & flags)' failed.
>
> It's because memory API provides a way to postpone dirty log stop if the VM is
> stopped, and that'll be re-done until the next VM start. It was added in 2017
> with commit 1931076077 ("migration: optimize the downtime", 2017-08-01).
>
> However the recent work on allowing dirty tracking to be bitmask broke it,
> which is commit 63b41db4bc ("memory: make global_dirty_tracking a bitmask",
> 2021-11-01).
>
> The fix proposed in this patch contains two things:
>
> (1) Instead of passing over the flags to postpone stop dirty track, we add a
> global variable (along with current vmstate_change variable) to record
> what flags to stop dirty tracking.
>
> (2) When start dirty tracking, instead if remove the vmstate hook directly,
> we also execute the postponed stop process so that we make sure all the
> starts and stops will be paired.
>
> This procedure is overlooked in the bitmask-ify work in 2021.
>
> Cc: Hyman Huang <huangy81@chinatelecom.cn>
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2044818
> Fixes: 63b41db4bc ("memory: make global_dirty_tracking a bitmask")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> softmmu/memory.c | 61 +++++++++++++++++++++++++++++++++++-------------
> 1 file changed, 45 insertions(+), 16 deletions(-)
>
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 678dc62f06..8060c6de78 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -2790,19 +2790,32 @@ void memory_global_after_dirty_log_sync(void)
> MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
> }
>
> +/*
> + * Dirty track stop flags that are postponed due to VM being stopped. Should
> + * only be used within vmstate_change hook.
> + */
> +static unsigned int postponed_stop_flags;
> static VMChangeStateEntry *vmstate_change;
> +static void memory_global_dirty_log_stop_postponed_run(void);
>
> void memory_global_dirty_log_start(unsigned int flags)
> {
> - unsigned int old_flags = global_dirty_tracking;
> + unsigned int old_flags;
> +
> + assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
>
> if (vmstate_change) {
> - qemu_del_vm_change_state_handler(vmstate_change);
> - vmstate_change = NULL;
> + /* If there is postponed stop(), operate on it first */
> + postponed_stop_flags &= ~flags;
> + memory_global_dirty_log_stop_postponed_run();
> }
>
> - assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
> - assert(!(global_dirty_tracking & flags));
> + flags &= ~global_dirty_tracking;
> + if (!flags) {
> + return;
> + }
> +
> + old_flags = global_dirty_tracking;
> global_dirty_tracking |= flags;
> trace_global_dirty_changed(global_dirty_tracking);
>
> @@ -2830,29 +2843,45 @@ static void memory_global_dirty_log_do_stop(unsigned int flags)
> }
> }
>
> +/*
> + * Execute the postponed dirty log stop operations if there is, then reset
> + * everything (including the flags and the vmstate change hook).
> + */
> +static void memory_global_dirty_log_stop_postponed_run(void)
> +{
> + /* This must be called with the vmstate handler registered */
> + assert(vmstate_change);
> +
> + /* Note: postponed_stop_flags can be cleared in log start routine */
> + if (postponed_stop_flags) {
> + memory_global_dirty_log_do_stop(postponed_stop_flags);
> + postponed_stop_flags = 0;
> + }
> +
> + qemu_del_vm_change_state_handler(vmstate_change);
> + vmstate_change = NULL;
> +}
> +
> static void memory_vm_change_state_handler(void *opaque, bool running,
> RunState state)
> {
> - unsigned int flags = (unsigned int)(uintptr_t)opaque;
> if (running) {
> - memory_global_dirty_log_do_stop(flags);
> -
> - if (vmstate_change) {
> - qemu_del_vm_change_state_handler(vmstate_change);
> - vmstate_change = NULL;
> - }
> + memory_global_dirty_log_stop_postponed_run();
> }
> }
>
> void memory_global_dirty_log_stop(unsigned int flags)
> {
> if (!runstate_is_running()) {
> + /* Postpone the dirty log stop, e.g., to when VM starts again */
> if (vmstate_change) {
> - return;
> + /* Batch with previous postponed flags */
> + postponed_stop_flags |= flags;
> + } else {
> + postponed_stop_flags = flags;
> + vmstate_change = qemu_add_vm_change_state_handler(
> + memory_vm_change_state_handler, NULL);
> }
> - vmstate_change = qemu_add_vm_change_state_handler(
> - memory_vm_change_state_handler,
> - (void *)(uintptr_t)flags);
> return;
> }
>
Queued, thanks.
Paolo
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2022-02-08 11:34 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-02-07 12:30 [PATCH v2] memory: Fix qemu crash on starting dirty log twice with stopped VM Peter Xu
2022-02-08 11:15 ` Paolo Bonzini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).