qemu-devel.nongnu.org archive mirror
* s390 migration crash
@ 2023-03-21 20:24 Dr. David Alan Gilbert
  2023-03-22  0:19 ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Dr. David Alan Gilbert @ 2023-03-21 20:24 UTC (permalink / raw)
  To: peterx, peter.maydell; +Cc: qemu-devel, quintela

Hi Peters,
  Peter M pointed me to a segfault in a migration test in CI; I can
reproduce it:
  * On an s390 host
  * only as part of a make check - running migration-test by itself
doesn't trigger for me.
  * It looks like it's postcopy preempt

(gdb) bt full
#0  iov_size (iov=iov@entry=0x2aa00e60670, iov_cnt=<optimized out>) at ../util/iov.c:88
        len = 13517923312037845750
        i = 17305
#1  0x000002aa004d068c in qemu_fflush (f=0x2aa00e58630) at ../migration/qemu-file.c:307
        local_error = 0x0
#2  0x000002aa004d0e04 in qemu_fflush (f=<optimized out>) at ../migration/qemu-file.c:297
#3  0x000002aa00613962 in postcopy_preempt_shutdown_file (s=s@entry=0x2aa00d1b4e0) at ../migration/ram.c:4657
#4  0x000002aa004e12b4 in migration_completion (s=0x2aa00d1b4e0) at ../migration/migration.c:3469
        ret = <optimized out>
        current_active_state = 5
        must_precopy = 0
        can_postcopy = 0
        in_postcopy = true
        pending_size = 0
        __func__ = "migration_iteration_run"
        iter_state = <optimized out>
        s = 0x2aa00d1b4e0
        thread = <optimized out>
        setup_start = <optimized out>
        thr_error = <optimized out>
        urgent = <optimized out>
#5  migration_iteration_run (s=0x2aa00d1b4e0) at ../migration/migration.c:3882
        must_precopy = 0
        can_postcopy = 0
        in_postcopy = true
        pending_size = 0
        __func__ = "migration_iteration_run"
        iter_state = <optimized out>
        s = 0x2aa00d1b4e0
        thread = <optimized out>
        setup_start = <optimized out>
        thr_error = <optimized out>
        urgent = <optimized out>
#6  migration_thread (opaque=opaque@entry=0x2aa00d1b4e0) at ../migration/migration.c:4124
        iter_state = <optimized out>
        s = 0x2aa00d1b4e0
        thread = <optimized out>
        setup_start = <optimized out>
        thr_error = <optimized out>
        urgent = <optimized out>
#7  0x000002aa00819b8c in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:541
        __cancel_buf = 
            {__cancel_jmp_buf = {{__cancel_jmp_buf = {{__gregs = {4396782422080, 4393751543808, 4397299389454, 4396844235904, 2929182727824, 2929182933488, 4396843986792, 4397299389455, 33679382915066768, 33678512846981306}, __fpregs = {4396774031360, 8392704, 2929182933488, 0, 4396782422272, 2929172491858, 4396774031360, 1}}}, __mask_was_saved = 0}}, __pad = {0x3ffb4a77a60, 0x0, 0x0, 0x0}}
        __cancel_routine = 0x2aa00819bf0 <qemu_thread_atexit_notify>
        __not_first_call = <optimized out>
        start_routine = 0x2aa004e08f0 <migration_thread>
        arg = 0x2aa00d1b4e0
        r = <optimized out>
#8  0x000003ffb7b1e2e6 in start_thread () at /lib64/libc.so.6
#9  0x000003ffb7aafdbe in thread_start () at /lib64/libc.so.6

It looks like it's in the preempt test:

(gdb) where
#0  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x000003ffb1750890 in raise () from /lib64/libc.so.6
#2  0x000003ffb172a340 in abort () from /lib64/libc.so.6
#3  0x000002aa0041c130 in qtest_check_status (s=<optimized out>) at ../tests/qtest/libqtest.c:194
#4  0x000003ffb1a3b5de in g_hook_list_invoke () from /lib64/libglib-2.0.so.0
#5  <signal handler called>
#6  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
#7  0x000003ffb1750890 in raise () from /lib64/libc.so.6
#8  0x000003ffb172a340 in abort () from /lib64/libc.so.6
#9  0x000002aa00420318 in qmp_fd_receive (fd=<optimized out>) at ../tests/qtest/libqmp.c:80
#10 0x000002aa0041d5ee in qtest_qmp_receive_dict (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:713
#11 qtest_qmp_receive (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:701
#12 qtest_vqmp (s=s@entry=0x2aa01eb2700, fmt=fmt@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }", ap=ap@entry=0x3ffc247cc68)
    at ../tests/qtest/libqtest.c:765
#13 0x000002aa00413f1e in wait_command (who=who@entry=0x2aa01eb2700, command=command@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }")
    at ../tests/qtest/migration-helpers.c:73
#14 0x000002aa00414078 in migrate_query (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:139
#15 migrate_query_status (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:161
#16 0x000002aa00414480 in check_migration_status (ungoals=0x0, goal=0x2aa00495c7e "completed", who=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:177
#17 wait_for_migration_status (who=0x2aa01eb2700, goal=<optimized out>, ungoals=0x0) at ../tests/qtest/migration-helpers.c:202
#18 0x000002aa0041300e in migrate_postcopy_complete (from=from@entry=0x2aa01eb2700, to=to@entry=0x2aa01eb3000, args=args@entry=0x3ffc247cf48)
    at ../tests/qtest/migration-test.c:1137
#19 0x000002aa004131a4 in test_postcopy_common (args=0x3ffc247cf48) at ../tests/qtest/migration-test.c:1162
#20 test_postcopy_preempt () at ../tests/qtest/migration-test.c:1178

Looking at the iov and the file, they're garbage; that makes me think
this is something like a flush on a closed file.
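
For reference, iov_size() itself is just a loop summing iov_len, so the
wild len/i in frame #0 is what iterating over freed or reused memory would
look like (paraphrasing ../util/iov.c):

    size_t iov_size(const struct iovec *iov, const unsigned int iov_cnt)
    {
        size_t len;
        unsigned int i;

        len = 0;
        for (i = 0; i < iov_cnt; i++) {
            len += iov[i].iov_len;
        }
        return len;
    }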

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: s390 migration crash
  2023-03-21 20:24 s390 migration crash Dr. David Alan Gilbert
@ 2023-03-22  0:19 ` Peter Xu
  2023-03-22 14:05   ` Dr. David Alan Gilbert
  2023-03-22 19:21   ` Daniel P. Berrangé
  0 siblings, 2 replies; 8+ messages in thread
From: Peter Xu @ 2023-03-22  0:19 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: peter.maydell, qemu-devel, quintela

[-- Attachment #1: Type: text/plain, Size: 6621 bytes --]

On Tue, Mar 21, 2023 at 08:24:37PM +0000, Dr. David Alan Gilbert wrote:
> Hi Peters,
>   Peter M pointed me to a segfault in a migration test in CI; I can
> reproduce it:
>   * On an s390 host

How easy is it to reproduce?

>   * only as part of a make check - running migration-test by itself
> doesn't trigger for me.
>   * It looks like it's postcopy preempt
> 
> (gdb) bt full
> #0  iov_size (iov=iov@entry=0x2aa00e60670, iov_cnt=<optimized out>) at ../util/iov.c:88
>         len = 13517923312037845750
>         i = 17305
> #1  0x000002aa004d068c in qemu_fflush (f=0x2aa00e58630) at ../migration/qemu-file.c:307
>         local_error = 0x0
> #2  0x000002aa004d0e04 in qemu_fflush (f=<optimized out>) at ../migration/qemu-file.c:297
> #3  0x000002aa00613962 in postcopy_preempt_shutdown_file (s=s@entry=0x2aa00d1b4e0) at ../migration/ram.c:4657
> #4  0x000002aa004e12b4 in migration_completion (s=0x2aa00d1b4e0) at ../migration/migration.c:3469
>         ret = <optimized out>
>         current_active_state = 5
>         must_precopy = 0
>         can_postcopy = 0
>         in_postcopy = true
>         pending_size = 0
>         __func__ = "migration_iteration_run"
>         iter_state = <optimized out>
>         s = 0x2aa00d1b4e0
>         thread = <optimized out>
>         setup_start = <optimized out>
>         thr_error = <optimized out>
>         urgent = <optimized out>
> #5  migration_iteration_run (s=0x2aa00d1b4e0) at ../migration/migration.c:3882
>         must_precopy = 0
>         can_postcopy = 0
>         in_postcopy = true
>         pending_size = 0
>         __func__ = "migration_iteration_run"
>         iter_state = <optimized out>
>         s = 0x2aa00d1b4e0
>         thread = <optimized out>
>         setup_start = <optimized out>
>         thr_error = <optimized out>
>         urgent = <optimized out>
> #6  migration_thread (opaque=opaque@entry=0x2aa00d1b4e0) at ../migration/migration.c:4124
>         iter_state = <optimized out>
>         s = 0x2aa00d1b4e0
>         thread = <optimized out>
>         setup_start = <optimized out>
>         thr_error = <optimized out>
>         urgent = <optimized out>
> #7  0x000002aa00819b8c in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:541
>         __cancel_buf = 
>             {__cancel_jmp_buf = {{__cancel_jmp_buf = {{__gregs = {4396782422080, 4393751543808, 4397299389454, 4396844235904, 2929182727824, 2929182933488, 4396843986792, 4397299389455, 33679382915066768, 33678512846981306}, __fpregs = {4396774031360, 8392704, 2929182933488, 0, 4396782422272, 2929172491858, 4396774031360, 1}}}, __mask_was_saved = 0}}, __pad = {0x3ffb4a77a60, 0x0, 0x0, 0x0}}
>         __cancel_routine = 0x2aa00819bf0 <qemu_thread_atexit_notify>
>         __not_first_call = <optimized out>
>         start_routine = 0x2aa004e08f0 <migration_thread>
>         arg = 0x2aa00d1b4e0
>         r = <optimized out>
> #8  0x000003ffb7b1e2e6 in start_thread () at /lib64/libc.so.6
> #9  0x000003ffb7aafdbe in thread_start () at /lib64/libc.so.6
> 
> It looks like it's in the preempt test:
> 
> (gdb) where
> #0  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
> #1  0x000003ffb1750890 in raise () from /lib64/libc.so.6
> #2  0x000003ffb172a340 in abort () from /lib64/libc.so.6
> #3  0x000002aa0041c130 in qtest_check_status (s=<optimized out>) at ../tests/qtest/libqtest.c:194
> #4  0x000003ffb1a3b5de in g_hook_list_invoke () from /lib64/libglib-2.0.so.0
> #5  <signal handler called>
> #6  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
> #7  0x000003ffb1750890 in raise () from /lib64/libc.so.6
> #8  0x000003ffb172a340 in abort () from /lib64/libc.so.6
> #9  0x000002aa00420318 in qmp_fd_receive (fd=<optimized out>) at ../tests/qtest/libqmp.c:80
> #10 0x000002aa0041d5ee in qtest_qmp_receive_dict (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:713
> #11 qtest_qmp_receive (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:701
> #12 qtest_vqmp (s=s@entry=0x2aa01eb2700, fmt=fmt@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }", ap=ap@entry=0x3ffc247cc68)
>     at ../tests/qtest/libqtest.c:765
> #13 0x000002aa00413f1e in wait_command (who=who@entry=0x2aa01eb2700, command=command@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }")
>     at ../tests/qtest/migration-helpers.c:73
> #14 0x000002aa00414078 in migrate_query (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:139
> #15 migrate_query_status (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:161
> #16 0x000002aa00414480 in check_migration_status (ungoals=0x0, goal=0x2aa00495c7e "completed", who=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:177
> #17 wait_for_migration_status (who=0x2aa01eb2700, goal=<optimized out>, ungoals=0x0) at ../tests/qtest/migration-helpers.c:202
> #18 0x000002aa0041300e in migrate_postcopy_complete (from=from@entry=0x2aa01eb2700, to=to@entry=0x2aa01eb3000, args=args@entry=0x3ffc247cf48)
>     at ../tests/qtest/migration-test.c:1137
> #19 0x000002aa004131a4 in test_postcopy_common (args=0x3ffc247cf48) at ../tests/qtest/migration-test.c:1162
> #20 test_postcopy_preempt () at ../tests/qtest/migration-test.c:1178
> 
> Looking at the iov and the file, they're garbage; that makes me think
> this is something like a flush on a closed file.

I didn't figure out how that could be closed, but I think there's indeed a
possible race where the qemufile can be accessed by both the return path
thread and the migration thread concurrently, while qemufile is not thread
safe.

What postcopy_preempt_shutdown_file() does is: the src uses this EOS to
kick the dest QEMU preempt thread out of the migration and shut it off.
After some thought I think this is unnecessary complexity, since postcopy
should end at the point where dest has received all the data, after which
it sends a SHUT to src.  So it's potentially not good to have dest rely on
anything from src to shut anything down (the preempt thread here), because
it's the dest qemu that makes the final decision to finish.  Ideally the
preempt thread on dest should be able to shut itself down.

The trick here is that the preempt thread will be blocked in read() (aka
recvmsg()) on the channel at that time, and the only way to kick it out is
a shutdown() on dest.  I attached a patch that does it.  I'm not 100% sure
whether it'll already resolve our problem, but it's worth trying.  This
also made me notice that we forgot to enable the SHUTDOWN feature on the
tls server when I was running patch 1 with qtest, so two patches are
needed.
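
As a minimal standalone sketch of that trick (plain POSIX socketpair
standing in for the migration channel; not QEMU code): a thread blocked
in recv() is kicked out as soon as another thread calls shutdown() on the
same fd:

    #include <pthread.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void *reader(void *opaque)
    {
        int fd = *(int *)opaque;
        char buf[16];
        /* Blocks in recvmsg() until data arrives or the fd is shut down */
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        printf("recv returned %zd\n", n);  /* 0 (EOF) after shutdown() */
        return NULL;
    }

    int main(void)
    {
        int fds[2];
        pthread_t t;

        socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
        pthread_create(&t, NULL, reader, &fds[0]);
        sleep(1);                     /* let the reader block in recv() */
        shutdown(fds[0], SHUT_RDWR);  /* wakes the blocked reader */
        pthread_join(t, NULL);
        close(fds[0]);
        close(fds[1]);
        return 0;
    }

(Note that close() alone wouldn't reliably wake a thread blocked in
recv() on the same fd; shutdown() is the portable way to do it, which is
why patch 1 needs QIO_CHANNEL_FEATURE_SHUTDOWN on the tls server side.)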

-- 
Peter Xu

[-- Attachment #2: 0001-io-tls-Inherit-QIO_CHANNEL_FEATURE_SHUTDOWN-on-serve.patch --]
[-- Type: text/plain, Size: 1073 bytes --]

From 0e317fa78e9671c119f6be78a0e0a36201517dc2 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Tue, 21 Mar 2023 19:58:42 -0400
Subject: [PATCH 1/2] io: tls: Inherit QIO_CHANNEL_FEATURE_SHUTDOWN on server
 side

The TLS iochannel inherits io_shutdown() from the master ioc; however, we
missed doing that on the server side.

This will e.g. allow qemu_file_shutdown() to work on dest QEMU too for
migration.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 io/channel-tls.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 5a7a3d48d6..9805dd0a3f 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -74,6 +74,9 @@ qio_channel_tls_new_server(QIOChannel *master,
     ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
 
     ioc->master = master;
+    if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
+        qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SHUTDOWN);
+    }
     object_ref(OBJECT(master));
 
     ioc->session = qcrypto_tls_session_new(
-- 
2.39.1


[-- Attachment #3: 0002-migration-Fix-potential-race-on-postcopy_qemufile_sr.patch --]
[-- Type: text/plain, Size: 7821 bytes --]

From 73b267c17e689d0dafbde1d93c0125b81a43184e Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Tue, 21 Mar 2023 19:11:22 -0400
Subject: [PATCH 2/2] migration: Fix potential race on postcopy_qemufile_src

The postcopy_qemufile_src object should be owned by one thread at a time,
either the main thread (e.g. at the beginning or at the end of migration),
or the return path thread (during a preempt enabled postcopy migration).
If that's not the case, access to the object might be racy.

postcopy_preempt_shutdown_file() is potentially racy, because it's called
in the end phase of migration on the main thread, at which point the
return path thread hasn't yet been recycled; the recycling happens in
await_return_path_close_on_source(), which is after this point.

It means it's logically possible that the main thread and the return path
thread are both operating on the same qemufile, while qemufile is not
thread safe at all.

postcopy_preempt_shutdown_file() used to be needed because that's where we
send EOS to dest so that dest can safely shut down the preempt thread.

To avoid the possible race, remove the only place where the race can
happen.  Instead, find another way to safely close the preempt thread on
dest.

The core idea for deciding "when to stop" during postcopy is that dest
will send a postcopy SHUT message to src, telling src that all the data
has arrived.  Hence it is better to shut down the dest preempt thread
directly on the dest node.

This patch proposes such a way: change postcopy_prio_thread_created into
PreemptThreadStatus, so that we kick the preempt thread on dest qemu with
the sequence:

  mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
  qemu_file_shutdown(mis->postcopy_qemufile_dst);

Here shutdown() is probably the easiest way so far to kick the preempt
thread out of a blocked qemu_get_be64().  The thread then reads
preempt_thread_status to make sure it's not a network failure but a
request for the thread to quit.

We could have avoided the extra status and just relied on the migration
status.  The problem is that postcopy_ram_incoming_cleanup() is called
early enough that we're still in POSTCOPY_ACTIVE no matter what, so keep
it simple and introduce the status.

Fixes: 36f62f11e4 ("migration: Postcopy preemption preparation on channel creation")
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c    |  7 -------
 migration/migration.h    | 13 ++++++++++++-
 migration/postcopy-ram.c | 20 +++++++++++++++-----
 migration/ram.c          |  6 ------
 migration/ram.h          |  1 -
 5 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index ae2025d9d8..bbc36100de 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3459,16 +3459,9 @@ static void migration_completion(MigrationState *s)
         }
     } else if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
         trace_migration_completion_postcopy_end();
-
         qemu_mutex_lock_iothread();
         qemu_savevm_state_complete_postcopy(s->to_dst_file);
         qemu_mutex_unlock_iothread();
-
-        /* Shutdown the postcopy fast path thread */
-        if (migrate_postcopy_preempt()) {
-            postcopy_preempt_shutdown_file(s);
-        }
-
         trace_migration_completion_postcopy_end_after_complete();
     } else {
         goto fail;
diff --git a/migration/migration.h b/migration/migration.h
index 2da2f8a164..7228163cc8 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -65,6 +65,12 @@ typedef struct {
     bool all_zero;
 } PostcopyTmpPage;
 
+typedef enum {
+    PREEMPT_THREAD_NONE = 0,
+    PREEMPT_THREAD_CREATED,
+    PREEMPT_THREAD_QUIT,
+} PreemptThreadStatus;
+
 /* State for the incoming migration */
 struct MigrationIncomingState {
     QEMUFile *from_src_file;
@@ -124,7 +130,12 @@ struct MigrationIncomingState {
     QemuSemaphore postcopy_qemufile_dst_done;
     /* Postcopy priority thread is used to receive postcopy requested pages */
     QemuThread postcopy_prio_thread;
-    bool postcopy_prio_thread_created;
+    /*
+     * Always set by the main vm load thread only, but can be read by the
+     * postcopy preempt thread.  "volatile" makes sure all reads will be
+     * up to date across cores.
+     */
+    volatile PreemptThreadStatus preempt_thread_status;
     /*
      * Used to sync between the ram load main thread and the fast ram load
      * thread.  It protects postcopy_qemufile_dst, which is the postcopy
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 41c0713650..263bab75ec 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -568,9 +568,14 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
 {
     trace_postcopy_ram_incoming_cleanup_entry();
 
-    if (mis->postcopy_prio_thread_created) {
+    if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
+        /* Notify the fast load thread to quit */
+        mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
+        if (mis->postcopy_qemufile_dst) {
+            qemu_file_shutdown(mis->postcopy_qemufile_dst);
+        }
         qemu_thread_join(&mis->postcopy_prio_thread);
-        mis->postcopy_prio_thread_created = false;
+        mis->preempt_thread_status = PREEMPT_THREAD_NONE;
     }
 
     if (mis->have_fault_thread) {
@@ -1203,7 +1208,7 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
          */
         postcopy_thread_create(mis, &mis->postcopy_prio_thread, "fault-fast",
                                postcopy_preempt_thread, QEMU_THREAD_JOINABLE);
-        mis->postcopy_prio_thread_created = true;
+        mis->preempt_thread_status = PREEMPT_THREAD_CREATED;
     }
 
     trace_postcopy_ram_enable_notify();
@@ -1652,6 +1657,11 @@ static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
     trace_postcopy_pause_fast_load_continued();
 }
 
+static bool preempt_thread_should_run(MigrationIncomingState *mis)
+{
+    return mis->preempt_thread_status != PREEMPT_THREAD_QUIT;
+}
+
 void *postcopy_preempt_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -1671,11 +1681,11 @@ void *postcopy_preempt_thread(void *opaque)
 
     /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */
     qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
-    while (1) {
+    while (preempt_thread_should_run(mis)) {
         ret = ram_load_postcopy(mis->postcopy_qemufile_dst,
                                 RAM_CHANNEL_POSTCOPY);
         /* If error happened, go into recovery routine */
-        if (ret) {
+        if (ret && preempt_thread_should_run(mis)) {
             postcopy_pause_ram_fast_load(mis);
         } else {
             /* We're done */
diff --git a/migration/ram.c b/migration/ram.c
index 96e8a19a58..8c316ffed2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4651,12 +4651,6 @@ static int ram_resume_prepare(MigrationState *s, void *opaque)
     return 0;
 }
 
-void postcopy_preempt_shutdown_file(MigrationState *s)
-{
-    qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS);
-    qemu_fflush(s->postcopy_qemufile_src);
-}
-
 static SaveVMHandlers savevm_ram_handlers = {
     .save_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/migration/ram.h b/migration/ram.h
index 81cbb0947c..93b071a1a7 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -96,7 +96,6 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
                                   const char *block_name);
 int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
 bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
-void postcopy_preempt_shutdown_file(MigrationState *s);
 void *postcopy_preempt_thread(void *opaque);
 
 /* ram cache */
-- 
2.39.1



* Re: s390 migration crash
  2023-03-22  0:19 ` Peter Xu
@ 2023-03-22 14:05   ` Dr. David Alan Gilbert
  2023-03-22 15:02     ` Peter Xu
  2023-03-22 19:21   ` Daniel P. Berrangé
  1 sibling, 1 reply; 8+ messages in thread
From: Dr. David Alan Gilbert @ 2023-03-22 14:05 UTC (permalink / raw)
  To: Peter Xu; +Cc: peter.maydell, qemu-devel, quintela

* Peter Xu (peterx@redhat.com) wrote:
> On Tue, Mar 21, 2023 at 08:24:37PM +0000, Dr. David Alan Gilbert wrote:
> > Hi Peters,
> >   Peter M pointed me to a segfault in a migration test in CI; I can
> > reproduce it:
> >   * On an s390 host
> 
> How easy is it to reproduce?

Pretty much every time when run as:
make check -j 4

> >   * only as part of a make check - running migration-test by itself
> > doesn't trigger for me.
> >   * It looks like it's postcopy preempt
> > 
> > (gdb) bt full
> > #0  iov_size (iov=iov@entry=0x2aa00e60670, iov_cnt=<optimized out>) at ../util/iov.c:88
> >         len = 13517923312037845750
> >         i = 17305
> > #1  0x000002aa004d068c in qemu_fflush (f=0x2aa00e58630) at ../migration/qemu-file.c:307
> >         local_error = 0x0
> > #2  0x000002aa004d0e04 in qemu_fflush (f=<optimized out>) at ../migration/qemu-file.c:297
> > #3  0x000002aa00613962 in postcopy_preempt_shutdown_file (s=s@entry=0x2aa00d1b4e0) at ../migration/ram.c:4657
> > #4  0x000002aa004e12b4 in migration_completion (s=0x2aa00d1b4e0) at ../migration/migration.c:3469
> >         ret = <optimized out>
> >         current_active_state = 5
> >         must_precopy = 0
> >         can_postcopy = 0
> >         in_postcopy = true
> >         pending_size = 0
> >         __func__ = "migration_iteration_run"
> >         iter_state = <optimized out>
> >         s = 0x2aa00d1b4e0
> >         thread = <optimized out>
> >         setup_start = <optimized out>
> >         thr_error = <optimized out>
> >         urgent = <optimized out>
> > #5  migration_iteration_run (s=0x2aa00d1b4e0) at ../migration/migration.c:3882
> >         must_precopy = 0
> >         can_postcopy = 0
> >         in_postcopy = true
> >         pending_size = 0
> >         __func__ = "migration_iteration_run"
> >         iter_state = <optimized out>
> >         s = 0x2aa00d1b4e0
> >         thread = <optimized out>
> >         setup_start = <optimized out>
> >         thr_error = <optimized out>
> >         urgent = <optimized out>
> > #6  migration_thread (opaque=opaque@entry=0x2aa00d1b4e0) at ../migration/migration.c:4124
> >         iter_state = <optimized out>
> >         s = 0x2aa00d1b4e0
> >         thread = <optimized out>
> >         setup_start = <optimized out>
> >         thr_error = <optimized out>
> >         urgent = <optimized out>
> > #7  0x000002aa00819b8c in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:541
> >         __cancel_buf = 
> >             {__cancel_jmp_buf = {{__cancel_jmp_buf = {{__gregs = {4396782422080, 4393751543808, 4397299389454, 4396844235904, 2929182727824, 2929182933488, 4396843986792, 4397299389455, 33679382915066768, 33678512846981306}, __fpregs = {4396774031360, 8392704, 2929182933488, 0, 4396782422272, 2929172491858, 4396774031360, 1}}}, __mask_was_saved = 0}}, __pad = {0x3ffb4a77a60, 0x0, 0x0, 0x0}}
> >         __cancel_routine = 0x2aa00819bf0 <qemu_thread_atexit_notify>
> >         __not_first_call = <optimized out>
> >         start_routine = 0x2aa004e08f0 <migration_thread>
> >         arg = 0x2aa00d1b4e0
> >         r = <optimized out>
> > #8  0x000003ffb7b1e2e6 in start_thread () at /lib64/libc.so.6
> > #9  0x000003ffb7aafdbe in thread_start () at /lib64/libc.so.6
> > 
> > It looks like it's in the preempt test:
> > 
> > (gdb) where
> > #0  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
> > #1  0x000003ffb1750890 in raise () from /lib64/libc.so.6
> > #2  0x000003ffb172a340 in abort () from /lib64/libc.so.6
> > #3  0x000002aa0041c130 in qtest_check_status (s=<optimized out>) at ../tests/qtest/libqtest.c:194
> > #4  0x000003ffb1a3b5de in g_hook_list_invoke () from /lib64/libglib-2.0.so.0
> > #5  <signal handler called>
> > #6  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
> > #7  0x000003ffb1750890 in raise () from /lib64/libc.so.6
> > #8  0x000003ffb172a340 in abort () from /lib64/libc.so.6
> > #9  0x000002aa00420318 in qmp_fd_receive (fd=<optimized out>) at ../tests/qtest/libqmp.c:80
> > #10 0x000002aa0041d5ee in qtest_qmp_receive_dict (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:713
> > #11 qtest_qmp_receive (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:701
> > #12 qtest_vqmp (s=s@entry=0x2aa01eb2700, fmt=fmt@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }", ap=ap@entry=0x3ffc247cc68)
> >     at ../tests/qtest/libqtest.c:765
> > #13 0x000002aa00413f1e in wait_command (who=who@entry=0x2aa01eb2700, command=command@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }")
> >     at ../tests/qtest/migration-helpers.c:73
> > #14 0x000002aa00414078 in migrate_query (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:139
> > #15 migrate_query_status (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:161
> > #16 0x000002aa00414480 in check_migration_status (ungoals=0x0, goal=0x2aa00495c7e "completed", who=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:177
> > #17 wait_for_migration_status (who=0x2aa01eb2700, goal=<optimized out>, ungoals=0x0) at ../tests/qtest/migration-helpers.c:202
> > #18 0x000002aa0041300e in migrate_postcopy_complete (from=from@entry=0x2aa01eb2700, to=to@entry=0x2aa01eb3000, args=args@entry=0x3ffc247cf48)
> >     at ../tests/qtest/migration-test.c:1137
> > #19 0x000002aa004131a4 in test_postcopy_common (args=0x3ffc247cf48) at ../tests/qtest/migration-test.c:1162
> > #20 test_postcopy_preempt () at ../tests/qtest/migration-test.c:1178
> > 
> > Looking at the iov and the file, they're garbage; that makes me think
> > this is something like a flush on a closed file.
> 
> I didn't figure out how that could be closed, but I think there's indeed a
> possible race where the qemufile can be accessed by both the return path
> thread and the migration thread concurrently, while qemufile is not thread
> safe.
> 
> What postcopy_preempt_shutdown_file() does is: the src uses this EOS to
> kick the dest QEMU preempt thread out of the migration and shut it off.
> After some thought I think this is unnecessary complexity, since postcopy
> should end at the point where dest has received all the data, after which
> it sends a SHUT to src.  So it's potentially not good to have dest rely on
> anything from src to shut anything down (the preempt thread here), because
> it's the dest qemu that makes the final decision to finish.  Ideally the
> preempt thread on dest should be able to shut itself down.
> 
> The trick here is that the preempt thread will be blocked in read() (aka
> recvmsg()) on the channel at that time, and the only way to kick it out is
> a shutdown() on dest.  I attached a patch that does it.  I'm not 100% sure
> whether it'll already resolve our problem, but it's worth trying.  This
> also made me notice that we forgot to enable the SHUTDOWN feature on the
> tls server when I was running patch 1 with qtest, so two patches are
> needed.

Well, it seems to fix it:

Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Will this break interaction with an old qemu that's still waiting for
the EOS?

Dave

> -- 
> Peter Xu

> From 0e317fa78e9671c119f6be78a0e0a36201517dc2 Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Tue, 21 Mar 2023 19:58:42 -0400
> Subject: [PATCH 1/2] io: tls: Inherit QIO_CHANNEL_FEATURE_SHUTDOWN on server
>  side
> 
> The TLS iochannel inherits io_shutdown() from the master ioc; however, we
> missed doing that on the server side.
> 
> This will e.g. allow qemu_file_shutdown() to work on dest QEMU too for
> migration.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  io/channel-tls.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/io/channel-tls.c b/io/channel-tls.c
> index 5a7a3d48d6..9805dd0a3f 100644
> --- a/io/channel-tls.c
> +++ b/io/channel-tls.c
> @@ -74,6 +74,9 @@ qio_channel_tls_new_server(QIOChannel *master,
>      ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
>  
>      ioc->master = master;
> +    if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
> +        qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SHUTDOWN);
> +    }
>      object_ref(OBJECT(master));
>  
>      ioc->session = qcrypto_tls_session_new(
> -- 
> 2.39.1
> 

> From 73b267c17e689d0dafbde1d93c0125b81a43184e Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Tue, 21 Mar 2023 19:11:22 -0400
> Subject: [PATCH 2/2] migration: Fix potential race on postcopy_qemufile_src
> 
> The postcopy_qemufile_src object should be owned by one thread at a time,
> either the main thread (e.g. at the beginning or at the end of migration),
> or the return path thread (during a preempt enabled postcopy migration).
> If that's not the case, access to the object might be racy.
> 
> postcopy_preempt_shutdown_file() is potentially racy, because it's called
> in the end phase of migration on the main thread, at which point the
> return path thread hasn't yet been recycled; the recycling happens in
> await_return_path_close_on_source(), which is after this point.
> 
> It means it's logically possible that the main thread and the return path
> thread are both operating on the same qemufile, while qemufile is not
> thread safe at all.
> 
> postcopy_preempt_shutdown_file() used to be needed because that's where we
> send EOS to dest so that dest can safely shut down the preempt thread.
> 
> To avoid the possible race, remove the only place where the race can
> happen.  Instead, find another way to safely close the preempt thread on
> dest.
> 
> The core idea for deciding "when to stop" during postcopy is that dest
> will send a postcopy SHUT message to src, telling src that all the data
> has arrived.  Hence it is better to shut down the dest preempt thread
> directly on the dest node.
> 
> This patch proposes such a way: change postcopy_prio_thread_created into
> PreemptThreadStatus, so that we kick the preempt thread on dest qemu with
> the sequence:
> 
>   mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
>   qemu_file_shutdown(mis->postcopy_qemufile_dst);
> 
> Here shutdown() is probably the easiest way so far to kick the preempt
> thread out of a blocked qemu_get_be64().  The thread then reads
> preempt_thread_status to make sure it's not a network failure but a
> request for the thread to quit.
> 
> We could have avoided the extra status and just relied on the migration
> status.  The problem is that postcopy_ram_incoming_cleanup() is called
> early enough that we're still in POSTCOPY_ACTIVE no matter what, so keep
> it simple and introduce the status.
> 
> Fixes: 36f62f11e4 ("migration: Postcopy preemption preparation on channel creation")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/migration.c    |  7 -------
>  migration/migration.h    | 13 ++++++++++++-
>  migration/postcopy-ram.c | 20 +++++++++++++++-----
>  migration/ram.c          |  6 ------
>  migration/ram.h          |  1 -
>  5 files changed, 27 insertions(+), 20 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index ae2025d9d8..bbc36100de 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3459,16 +3459,9 @@ static void migration_completion(MigrationState *s)
>          }
>      } else if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
>          trace_migration_completion_postcopy_end();
> -
>          qemu_mutex_lock_iothread();
>          qemu_savevm_state_complete_postcopy(s->to_dst_file);
>          qemu_mutex_unlock_iothread();
> -
> -        /* Shutdown the postcopy fast path thread */
> -        if (migrate_postcopy_preempt()) {
> -            postcopy_preempt_shutdown_file(s);
> -        }
> -
>          trace_migration_completion_postcopy_end_after_complete();
>      } else {
>          goto fail;
> diff --git a/migration/migration.h b/migration/migration.h
> index 2da2f8a164..7228163cc8 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -65,6 +65,12 @@ typedef struct {
>      bool all_zero;
>  } PostcopyTmpPage;
>  
> +typedef enum {
> +    PREEMPT_THREAD_NONE = 0,
> +    PREEMPT_THREAD_CREATED,
> +    PREEMPT_THREAD_QUIT,
> +} PreemptThreadStatus;
> +
>  /* State for the incoming migration */
>  struct MigrationIncomingState {
>      QEMUFile *from_src_file;
> @@ -124,7 +130,12 @@ struct MigrationIncomingState {
>      QemuSemaphore postcopy_qemufile_dst_done;
>      /* Postcopy priority thread is used to receive postcopy requested pages */
>      QemuThread postcopy_prio_thread;
> -    bool postcopy_prio_thread_created;
> +    /*
> +     * Always set by the main vm load thread only, but can be read by the
> +     * postcopy preempt thread.  "volatile" makes sure all reads will be
> +     * up to date across cores.
> +     */
> +    volatile PreemptThreadStatus preempt_thread_status;
>      /*
>       * Used to sync between the ram load main thread and the fast ram load
>       * thread.  It protects postcopy_qemufile_dst, which is the postcopy
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 41c0713650..263bab75ec 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -568,9 +568,14 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
>  {
>      trace_postcopy_ram_incoming_cleanup_entry();
>  
> -    if (mis->postcopy_prio_thread_created) {
> +    if (mis->preempt_thread_status == PREEMPT_THREAD_CREATED) {
> +        /* Notify the fast load thread to quit */
> +        mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
> +        if (mis->postcopy_qemufile_dst) {
> +            qemu_file_shutdown(mis->postcopy_qemufile_dst);
> +        }
>          qemu_thread_join(&mis->postcopy_prio_thread);
> -        mis->postcopy_prio_thread_created = false;
> +        mis->preempt_thread_status = PREEMPT_THREAD_NONE;
>      }
>  
>      if (mis->have_fault_thread) {
> @@ -1203,7 +1208,7 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
>           */
>          postcopy_thread_create(mis, &mis->postcopy_prio_thread, "fault-fast",
>                                 postcopy_preempt_thread, QEMU_THREAD_JOINABLE);
> -        mis->postcopy_prio_thread_created = true;
> +        mis->preempt_thread_status = PREEMPT_THREAD_CREATED;
>      }
>  
>      trace_postcopy_ram_enable_notify();
> @@ -1652,6 +1657,11 @@ static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
>      trace_postcopy_pause_fast_load_continued();
>  }
>  
> +static bool preempt_thread_should_run(MigrationIncomingState *mis)
> +{
> +    return mis->preempt_thread_status != PREEMPT_THREAD_QUIT;
> +}
> +
>  void *postcopy_preempt_thread(void *opaque)
>  {
>      MigrationIncomingState *mis = opaque;
> @@ -1671,11 +1681,11 @@ void *postcopy_preempt_thread(void *opaque)
>  
>      /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */
>      qemu_mutex_lock(&mis->postcopy_prio_thread_mutex);
> -    while (1) {
> +    while (preempt_thread_should_run(mis)) {
>          ret = ram_load_postcopy(mis->postcopy_qemufile_dst,
>                                  RAM_CHANNEL_POSTCOPY);
>          /* If error happened, go into recovery routine */
> -        if (ret) {
> +        if (ret && preempt_thread_should_run(mis)) {
>              postcopy_pause_ram_fast_load(mis);
>          } else {
>              /* We're done */
> diff --git a/migration/ram.c b/migration/ram.c
> index 96e8a19a58..8c316ffed2 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -4651,12 +4651,6 @@ static int ram_resume_prepare(MigrationState *s, void *opaque)
>      return 0;
>  }
>  
> -void postcopy_preempt_shutdown_file(MigrationState *s)
> -{
> -    qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS);
> -    qemu_fflush(s->postcopy_qemufile_src);
> -}
> -
>  static SaveVMHandlers savevm_ram_handlers = {
>      .save_setup = ram_save_setup,
>      .save_live_iterate = ram_save_iterate,
> diff --git a/migration/ram.h b/migration/ram.h
> index 81cbb0947c..93b071a1a7 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -96,7 +96,6 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
>                                    const char *block_name);
>  int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
>  bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
> -void postcopy_preempt_shutdown_file(MigrationState *s);
>  void *postcopy_preempt_thread(void *opaque);
>  
>  /* ram cache */
> -- 
> 2.39.1
> 

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: s390 migration crash
  2023-03-22 14:05   ` Dr. David Alan Gilbert
@ 2023-03-22 15:02     ` Peter Xu
  2023-03-22 18:13       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Xu @ 2023-03-22 15:02 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: peter.maydell, qemu-devel, quintela

On Wed, Mar 22, 2023 at 02:05:06PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Tue, Mar 21, 2023 at 08:24:37PM +0000, Dr. David Alan Gilbert wrote:
> > > Hi Peters,
> > >   Peter M pointed me to a segfault in a migration test in CI; I can
> > > reproduce it:
> > >   * On an s390 host
> > 
> > How easy is it to reproduce?
> 
> Pretty much every time when run as:
> make check -j 4
> 
> > >   * only as part of a make check - running migration-test by itself
> > > doesn't trigger for me.
> > >   * It looks like it's postcopy preempt
> > > 
> > > (gdb) bt full
> > > #0  iov_size (iov=iov@entry=0x2aa00e60670, iov_cnt=<optimized out>) at ../util/iov.c:88
> > >         len = 13517923312037845750
> > >         i = 17305
> > > #1  0x000002aa004d068c in qemu_fflush (f=0x2aa00e58630) at ../migration/qemu-file.c:307
> > >         local_error = 0x0
> > > #2  0x000002aa004d0e04 in qemu_fflush (f=<optimized out>) at ../migration/qemu-file.c:297
> > > #3  0x000002aa00613962 in postcopy_preempt_shutdown_file (s=s@entry=0x2aa00d1b4e0) at ../migration/ram.c:4657
> > > #4  0x000002aa004e12b4 in migration_completion (s=0x2aa00d1b4e0) at ../migration/migration.c:3469
> > >         ret = <optimized out>
> > >         current_active_state = 5
> > >         must_precopy = 0
> > >         can_postcopy = 0
> > >         in_postcopy = true
> > >         pending_size = 0
> > >         __func__ = "migration_iteration_run"
> > >         iter_state = <optimized out>
> > >         s = 0x2aa00d1b4e0
> > >         thread = <optimized out>
> > >         setup_start = <optimized out>
> > >         thr_error = <optimized out>
> > >         urgent = <optimized out>
> > > #5  migration_iteration_run (s=0x2aa00d1b4e0) at ../migration/migration.c:3882
> > >         must_precopy = 0
> > >         can_postcopy = 0
> > >         in_postcopy = true
> > >         pending_size = 0
> > >         __func__ = "migration_iteration_run"
> > >         iter_state = <optimized out>
> > >         s = 0x2aa00d1b4e0
> > >         thread = <optimized out>
> > >         setup_start = <optimized out>
> > >         thr_error = <optimized out>
> > >         urgent = <optimized out>
> > > #6  migration_thread (opaque=opaque@entry=0x2aa00d1b4e0) at ../migration/migration.c:4124
> > >         iter_state = <optimized out>
> > >         s = 0x2aa00d1b4e0
> > >         thread = <optimized out>
> > >         setup_start = <optimized out>
> > >         thr_error = <optimized out>
> > >         urgent = <optimized out>
> > > #7  0x000002aa00819b8c in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:541
> > >         __cancel_buf = 
> > >             {__cancel_jmp_buf = {{__cancel_jmp_buf = {{__gregs = {4396782422080, 4393751543808, 4397299389454, 4396844235904, 2929182727824, 2929182933488, 4396843986792, 4397299389455, 33679382915066768, 33678512846981306}, __fpregs = {4396774031360, 8392704, 2929182933488, 0, 4396782422272, 2929172491858, 4396774031360, 1}}}, __mask_was_saved = 0}}, __pad = {0x3ffb4a77a60, 0x0, 0x0, 0x0}}
> > >         __cancel_routine = 0x2aa00819bf0 <qemu_thread_atexit_notify>
> > >         __not_first_call = <optimized out>
> > >         start_routine = 0x2aa004e08f0 <migration_thread>
> > >         arg = 0x2aa00d1b4e0
> > >         r = <optimized out>
> > > #8  0x000003ffb7b1e2e6 in start_thread () at /lib64/libc.so.6
> > > #9  0x000003ffb7aafdbe in thread_start () at /lib64/libc.so.6
> > > 
> > > It looks like it's in the preempt test:
> > > 
> > > (gdb) where
> > > #0  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
> > > #1  0x000003ffb1750890 in raise () from /lib64/libc.so.6
> > > #2  0x000003ffb172a340 in abort () from /lib64/libc.so.6
> > > #3  0x000002aa0041c130 in qtest_check_status (s=<optimized out>) at ../tests/qtest/libqtest.c:194
> > > #4  0x000003ffb1a3b5de in g_hook_list_invoke () from /lib64/libglib-2.0.so.0
> > > #5  <signal handler called>
> > > #6  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
> > > #7  0x000003ffb1750890 in raise () from /lib64/libc.so.6
> > > #8  0x000003ffb172a340 in abort () from /lib64/libc.so.6
> > > #9  0x000002aa00420318 in qmp_fd_receive (fd=<optimized out>) at ../tests/qtest/libqmp.c:80
> > > #10 0x000002aa0041d5ee in qtest_qmp_receive_dict (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:713
> > > #11 qtest_qmp_receive (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:701
> > > #12 qtest_vqmp (s=s@entry=0x2aa01eb2700, fmt=fmt@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }", ap=ap@entry=0x3ffc247cc68)
> > >     at ../tests/qtest/libqtest.c:765
> > > #13 0x000002aa00413f1e in wait_command (who=who@entry=0x2aa01eb2700, command=command@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }")
> > >     at ../tests/qtest/migration-helpers.c:73
> > > #14 0x000002aa00414078 in migrate_query (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:139
> > > #15 migrate_query_status (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:161
> > > #16 0x000002aa00414480 in check_migration_status (ungoals=0x0, goal=0x2aa00495c7e "completed", who=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:177
> > > #17 wait_for_migration_status (who=0x2aa01eb2700, goal=<optimized out>, ungoals=0x0) at ../tests/qtest/migration-helpers.c:202
> > > #18 0x000002aa0041300e in migrate_postcopy_complete (from=from@entry=0x2aa01eb2700, to=to@entry=0x2aa01eb3000, args=args@entry=0x3ffc247cf48)
> > >     at ../tests/qtest/migration-test.c:1137
> > > #19 0x000002aa004131a4 in test_postcopy_common (args=0x3ffc247cf48) at ../tests/qtest/migration-test.c:1162
> > > #20 test_postcopy_preempt () at ../tests/qtest/migration-test.c:1178
> > > 
> > > Looking at the iov and the file, they're garbage; that makes me think
> > > this is something like a flush on a closed file.
> > 
> > I didn't figure out how that could be closed, but I think there's indeed a
> > possible race where the qemufile can be accessed by both the return path
> > thread and the migration thread concurrently, while qemufile is not thread
> > safe.
> > 
> > What postcopy_preempt_shutdown_file() does is: the src uses this EOS to
> > kick the dest QEMU preempt thread out of the migration and shut it off.
> > After some thought I think this is unnecessary complexity, since postcopy
> > should end at the point where dest has received all the data, after which
> > it sends a SHUT to src.  So it's potentially not good to have dest rely on
> > anything from src to shut anything down (the preempt thread here), because
> > it's the dest qemu that makes the final decision to finish.  Ideally the
> > preempt thread on dest should be able to shut itself down.
> > 
> > The trick here is that the preempt thread will be blocked in read() (aka
> > recvmsg()) on the channel at that time, and the only way to kick it out is
> > a shutdown() on dest.  I attached a patch that does it.  I'm not 100% sure
> > whether it'll already resolve our problem, but it's worth trying.  This
> > also made me notice that we forgot to enable the SHUTDOWN feature on the
> > tls server when I was running patch 1 with qtest, so two patches are
> > needed.
> 
> Well, it seems to fix it:
> 
> Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> Will this break interaction with an old qemu that's still waiting for
> the EOS?

Right... when someone migrates from an 8.0+ QEMU (assuming this can land in
this release) to a 7.2 QEMU with postcopy+preempt enabled.

I still see full preempt support only in 8.0; we did major rework in 8.0 to
shift to the return path.  So maybe it's worth the risk?  I can also add a
flag for this, but it may add maintenance burden in general OTOH.
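
For illustration only (a hypothetical sketch, not part of the attached
patches; the compat_preempt_eos field is made up here): such a flag on the
src side could gate the old EOS behaviour like:

    /* Hypothetical compat knob: keep sending the EOS that pre-8.0
     * dest QEMUs wait for, gated on a machine-compat property. */
    if (migrate_postcopy_preempt() && s->compat_preempt_eos) {
        postcopy_preempt_shutdown_file(s);
    }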

-- 
Peter Xu




* Re: s390 migration crash
  2023-03-22 15:02     ` Peter Xu
@ 2023-03-22 18:13       ` Dr. David Alan Gilbert
  2023-03-22 19:16         ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Dr. David Alan Gilbert @ 2023-03-22 18:13 UTC (permalink / raw)
  To: Peter Xu; +Cc: peter.maydell, qemu-devel, quintela

* Peter Xu (peterx@redhat.com) wrote:
> On Wed, Mar 22, 2023 at 02:05:06PM +0000, Dr. David Alan Gilbert wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > On Tue, Mar 21, 2023 at 08:24:37PM +0000, Dr. David Alan Gilbert wrote:
> > > > Hi Peters,
> > > >   Peter M pointed me to a segfault in a migration test in CI; I can
> > > > reproduce it:
> > > >   * On an s390 host
> > > 
> > > How easy is it to reproduce?
> > 
> > Pretty much every time when run as:
> > make check -j 4
> > 
> > > >   * only as part of a make check - running migration-test by itself
> > > > doesn't trigger for me.
> > > >   * It looks like it's postcopy preempt
> > > > 
> > > > (gdb) bt full
> > > > #0  iov_size (iov=iov@entry=0x2aa00e60670, iov_cnt=<optimized out>) at ../util/iov.c:88
> > > >         len = 13517923312037845750
> > > >         i = 17305
> > > > #1  0x000002aa004d068c in qemu_fflush (f=0x2aa00e58630) at ../migration/qemu-file.c:307
> > > >         local_error = 0x0
> > > > #2  0x000002aa004d0e04 in qemu_fflush (f=<optimized out>) at ../migration/qemu-file.c:297
> > > > #3  0x000002aa00613962 in postcopy_preempt_shutdown_file (s=s@entry=0x2aa00d1b4e0) at ../migration/ram.c:4657
> > > > #4  0x000002aa004e12b4 in migration_completion (s=0x2aa00d1b4e0) at ../migration/migration.c:3469
> > > >         ret = <optimized out>
> > > >         current_active_state = 5
> > > >         must_precopy = 0
> > > >         can_postcopy = 0
> > > >         in_postcopy = true
> > > >         pending_size = 0
> > > >         __func__ = "migration_iteration_run"
> > > >         iter_state = <optimized out>
> > > >         s = 0x2aa00d1b4e0
> > > >         thread = <optimized out>
> > > >         setup_start = <optimized out>
> > > >         thr_error = <optimized out>
> > > >         urgent = <optimized out>
> > > > #5  migration_iteration_run (s=0x2aa00d1b4e0) at ../migration/migration.c:3882
> > > >         must_precopy = 0
> > > >         can_postcopy = 0
> > > >         in_postcopy = true
> > > >         pending_size = 0
> > > >         __func__ = "migration_iteration_run"
> > > >         iter_state = <optimized out>
> > > >         s = 0x2aa00d1b4e0
> > > >         thread = <optimized out>
> > > >         setup_start = <optimized out>
> > > >         thr_error = <optimized out>
> > > >         urgent = <optimized out>
> > > > #6  migration_thread (opaque=opaque@entry=0x2aa00d1b4e0) at ../migration/migration.c:4124
> > > >         iter_state = <optimized out>
> > > >         s = 0x2aa00d1b4e0
> > > >         thread = <optimized out>
> > > >         setup_start = <optimized out>
> > > >         thr_error = <optimized out>
> > > >         urgent = <optimized out>
> > > > #7  0x000002aa00819b8c in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:541
> > > >         __cancel_buf = 
> > > >             {__cancel_jmp_buf = {{__cancel_jmp_buf = {{__gregs = {4396782422080, 4393751543808, 4397299389454, 4396844235904, 2929182727824, 2929182933488, 4396843986792, 4397299389455, 33679382915066768, 33678512846981306}, __fpregs = {4396774031360, 8392704, 2929182933488, 0, 4396782422272, 2929172491858, 4396774031360, 1}}}, __mask_was_saved = 0}}, __pad = {0x3ffb4a77a60, 0x0, 0x0, 0x0}}
> > > >         __cancel_routine = 0x2aa00819bf0 <qemu_thread_atexit_notify>
> > > >         __not_first_call = <optimized out>
> > > >         start_routine = 0x2aa004e08f0 <migration_thread>
> > > >         arg = 0x2aa00d1b4e0
> > > >         r = <optimized out>
> > > > #8  0x000003ffb7b1e2e6 in start_thread () at /lib64/libc.so.6
> > > > #9  0x000003ffb7aafdbe in thread_start () at /lib64/libc.so.6
> > > > 
> > > > It looks like it's in the preempt test:
> > > > 
> > > > (gdb) where
> > > > #0  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
> > > > #1  0x000003ffb1750890 in raise () from /lib64/libc.so.6
> > > > #2  0x000003ffb172a340 in abort () from /lib64/libc.so.6
> > > > #3  0x000002aa0041c130 in qtest_check_status (s=<optimized out>) at ../tests/qtest/libqtest.c:194
> > > > #4  0x000003ffb1a3b5de in g_hook_list_invoke () from /lib64/libglib-2.0.so.0
> > > > #5  <signal handler called>
> > > > #6  0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6
> > > > #7  0x000003ffb1750890 in raise () from /lib64/libc.so.6
> > > > #8  0x000003ffb172a340 in abort () from /lib64/libc.so.6
> > > > #9  0x000002aa00420318 in qmp_fd_receive (fd=<optimized out>) at ../tests/qtest/libqmp.c:80
> > > > #10 0x000002aa0041d5ee in qtest_qmp_receive_dict (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:713
> > > > #11 qtest_qmp_receive (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:701
> > > > #12 qtest_vqmp (s=s@entry=0x2aa01eb2700, fmt=fmt@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }", ap=ap@entry=0x3ffc247cc68)
> > > >     at ../tests/qtest/libqtest.c:765
> > > > #13 0x000002aa00413f1e in wait_command (who=who@entry=0x2aa01eb2700, command=command@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }")
> > > >     at ../tests/qtest/migration-helpers.c:73
> > > > #14 0x000002aa00414078 in migrate_query (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:139
> > > > #15 migrate_query_status (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:161
> > > > #16 0x000002aa00414480 in check_migration_status (ungoals=0x0, goal=0x2aa00495c7e "completed", who=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:177
> > > > #17 wait_for_migration_status (who=0x2aa01eb2700, goal=<optimized out>, ungoals=0x0) at ../tests/qtest/migration-helpers.c:202
> > > > #18 0x000002aa0041300e in migrate_postcopy_complete (from=from@entry=0x2aa01eb2700, to=to@entry=0x2aa01eb3000, args=args@entry=0x3ffc247cf48)
> > > >     at ../tests/qtest/migration-test.c:1137
> > > > #19 0x000002aa004131a4 in test_postcopy_common (args=0x3ffc247cf48) at ../tests/qtest/migration-test.c:1162
> > > > #20 test_postcopy_preempt () at ../tests/qtest/migration-test.c:1178
> > > > 
> > > > Looking at the iov and the file, they're garbage; that makes me think
> > > > this is something like a flush on a closed file.
> > > 
> > > I didn't figure out how that could be closed, but I think there's indeed a
> > > possible race where the qemufile can be accessed by both the return path
> > > thread and the migration thread concurrently, while qemufile is not thread
> > > safe.
> > > 
> > > What postcopy_preempt_shutdown_file() does is: the src uses this EOS to
> > > kick the dest QEMU preempt thread out of the migration and shut it off.
> > > After some thought I think this is unnecessary complexity, since postcopy
> > > should end at the point where dest has received all the data, after which
> > > it sends a SHUT to src.  So it's potentially not good to have dest rely on
> > > anything from src to shut anything down (the preempt thread here), because
> > > it's the dest qemu that makes the final decision to finish.  Ideally the
> > > preempt thread on dest should be able to shut itself down.
> > > 
> > > The trick here is that the preempt thread will be blocked in read() (aka
> > > recvmsg()) on the channel at that time, and the only way to kick it out is
> > > a shutdown() on dest.  I attached a patch that does it.  I'm not 100% sure
> > > whether it'll already resolve our problem, but it's worth trying.  This
> > > also made me notice that we forgot to enable the SHUTDOWN feature on the
> > > tls server when I was running patch 1 with qtest, so two patches are
> > > needed.
> > 
> > Well, it seems to fix it:
> > 
> > Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > 
> > Will this break interaction with an old qemu that's still waiting for
> > the EOS?
> 
> Right... when someone migrates from an 8.0+ QEMU (assuming this can land in
> this release) to a 7.2 QEMU with postcopy+preempt enabled.

So why not carry on sending the EOS?

> I still see full preempt support only in 8.0; we did major rework in 8.0 to
> shift to the return path.  So maybe it's worth the risk?  I can also add a
> flag for this, but it may add maintenance burden in general OTOH.

If you're saying the pre-8.0 support never worked, then I guess that's fine;
it's tricky for, say, libvirt or a higher-level tool to know what it can
use.

Dave

> -- 
> Peter Xu
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: s390 migration crash
  2023-03-22 18:13       ` Dr. David Alan Gilbert
@ 2023-03-22 19:16         ` Peter Xu
  2023-03-26 16:49           ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Xu @ 2023-03-22 19:16 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: peter.maydell, qemu-devel, quintela

On Wed, Mar 22, 2023 at 06:13:43PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Wed, Mar 22, 2023 at 02:05:06PM +0000, Dr. David Alan Gilbert wrote:
> > > * Peter Xu (peterx@redhat.com) wrote:
> > > > On Tue, Mar 21, 2023 at 08:24:37PM +0000, Dr. David Alan Gilbert wrote:
> > > > > Hi Peters,
> > > > >   Peter M pointed me to a segfault in a migration test in CI; I can
> > > > > reproduce it:
> > > > >   * On an s390 host
> > > > 
> > > > How easy is it to reproduce?
> > > 
> > > Pretty much every time when run as:
> > > make check -j 4
> > > 
> > > > >   * only as part of a make check - running migration-test by itself
> > > > > doesn't trigger for me.
> > > > >   * It looks like it's postcopy preempt

snip

> > > > > Looking at the iov and file it's garbage; so it makes me think this is
> > > > > something like a flush on a closed file.
> > > > 
> > > > I didn't figure out how that could be closed, but I think there's
> > > > indeed a possible race where the qemufile can be accessed by both the
> > > > return path thread and the migration thread concurrently, while the
> > > > qemufile is not thread safe.
> > > > 
> > > > What postcopy_preempt_shutdown_file() does is: the src uses this EOS to
> > > > kick the dest QEMU preempt thread out of the migration and shut it off.
> > > > After some thought I think this is unnecessary complexity, since postcopy
> > > > should end at the point where the dest has received all the data; it then
> > > > sends a SHUT to the src.  So it's not good to have the dest relying on
> > > > anything from the src to shut anything down (the preempt thread here),
> > > > because it's the dest qemu that makes the final decision to finish.
> > > > Ideally the preempt thread on the dest should be able to shut itself down.
> > > > 
> > > > The trick here is that the preempt thread will be blocked in read() (aka,
> > > > recvmsg()) on the channel at that time, and the only way to kick it out of
> > > > that is a shutdown() on the dest.  I attached a patch that does it.  I'm
> > > > not 100% sure whether it'll resolve our problem, but it's worth trying.
> > > > This also made me notice we forgot to enable the SHUTDOWN feature on the
> > > > tls server when I was running patch 1 with qtest, so two patches are
> > > > needed.
> > > 
> > > Well, it seems to fix it:
> > > 
> > > Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > 
> > > Will this break interaction with an old qemu that's still waiting for
> > > the EOS?
> > 
> > Right... when someone migrates from an 8.0+ QEMU (assuming this can land
> > in this release) to a 7.2 QEMU with postcopy+preempt enabled.
> 
> So why not carry on sending the EOS?

Because it's still racy to send it, so such a migration is potentially
buggy and can crash?  Or maybe you meant we could serialize the sends
(e.g. with some locks) so we keep the send but avoid the crash
(hopefully!).  Then the lock would only be used for old qemu binaries.
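
(To make that option concrete, here is a toy sketch of the serialization
idea, using plain pthreads and a FILE* as a stand-in for the shared
QEMUFile; this is not actual QEMU code, only the shape of the locking.)

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t stream_lock = PTHREAD_MUTEX_INITIALIZER;
static FILE *stream;   /* stands in for the QEMUFile shared by two threads */

static void send_record(const char *tag)
{
    /* Make each write+flush atomic with respect to the other sender */
    pthread_mutex_lock(&stream_lock);
    fprintf(stream, "%s\n", tag);   /* like a qemu_put_*() call */
    fflush(stream);                 /* like qemu_fflush() */
    pthread_mutex_unlock(&stream_lock);
}

static void *worker(void *arg)
{
    for (int i = 0; i < 1000; i++) {
        send_record((const char *)arg);
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;

    stream = stdout;   /* any shared stream will do for the demo */
    pthread_create(&a, NULL, worker, "migration-thread");
    pthread_create(&b, NULL, worker, "return-path-thread");
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}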

> 
> > I still see full preempt support only in 8.0; we did a major rework in 8.0
> > to shift to the return path.  So maybe it's worth the risk?  I can also add
> > a flag for this, but OTOH it may add maintenance burden in general.
> 
> If you're saying pre-8.0 never worked then I guess that's fine;

I think pre-8.0 works too.  The rework shouldn't have changed that fact or
the stream protocol.

> it's tricky for, say, libvirt or a higher level tool to know what it can
> use.

Libvirt doesn't support preempt mode yet, but yeah, it'll be a problem in
the future too.

So maybe I should introduce a flag and send this EOS conditionally,
ignoring the possibility of a crash?

-- 
Peter Xu




* Re: s390 migration crash
  2023-03-22  0:19 ` Peter Xu
  2023-03-22 14:05   ` Dr. David Alan Gilbert
@ 2023-03-22 19:21   ` Daniel P. Berrangé
  1 sibling, 0 replies; 8+ messages in thread
From: Daniel P. Berrangé @ 2023-03-22 19:21 UTC (permalink / raw)
  To: Peter Xu; +Cc: Dr. David Alan Gilbert, peter.maydell, qemu-devel, quintela

On Tue, Mar 21, 2023 at 08:19:00PM -0400, Peter Xu wrote:
> On Tue, Mar 21, 2023 at 08:24:37PM +0000, Dr. David Alan Gilbert wrote:
> > Hi Peter's,
> >   Peter M pointed me to a seg in a migration test in CI; I can reproduce
> > it:
> >   * On an s390 host
> 
> How easy to reproduce?
> 
> >   * only as part of a make check - running migration-test by itself
> > doesn't trigger for me.
> >   * It looks like it's postcopy preempt

snip

> > Looking at the iov and file it's garbage; so it makes me think this is
> > something like a flush on a closed file.
> 
> I didn't figure out how that could be closed, but I think there's indeed
> a possible race where the qemufile can be accessed by both the return
> path thread and the migration thread concurrently, while the qemufile is
> not thread safe.

snip

> From 0e317fa78e9671c119f6be78a0e0a36201517dc2 Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Tue, 21 Mar 2023 19:58:42 -0400
> Subject: [PATCH 1/2] io: tls: Inherit QIO_CHANNEL_FEATURE_SHUTDOWN on server
>  side
> 
> The TLS iochannel will inherit io_shutdown() from the master ioc; however,
> we missed doing that on the server side.
> 
> This will, e.g., allow qemu_file_shutdown() to work on the dest QEMU too,
> for migration.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  io/channel-tls.c | 3 +++
>  1 file changed, 3 insertions(+)

Acked-by: Daniel P. Berrangé <berrange@redhat.com>

> 
> diff --git a/io/channel-tls.c b/io/channel-tls.c
> index 5a7a3d48d6..9805dd0a3f 100644
> --- a/io/channel-tls.c
> +++ b/io/channel-tls.c
> @@ -74,6 +74,9 @@ qio_channel_tls_new_server(QIOChannel *master,
>      ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
>  
>      ioc->master = master;
> +    if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
> +        qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SHUTDOWN);
> +    }
>      object_ref(OBJECT(master));
>  
>      ioc->session = qcrypto_tls_session_new(
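
(For context on why the feature bit matters, a simplified sketch of the
caller-side check, modelled on migration/qemu-file.c but not copied from
it; details may differ.  Without the inheritance added above, a TLS
server channel never advertises the feature, so a shutdown request on it
is refused.)

#include "qemu/osdep.h"   /* assumes QEMU's build environment */
#include "io/channel.h"

static int channel_shutdown_sketch(QIOChannel *ioc)
{
    if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
        return -ENOSYS;   /* shutdown not supported on this channel */
    }
    /* Kicks any thread blocked reading or writing this channel */
    return qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
}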



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: s390 migration crash
  2023-03-22 19:16         ` Peter Xu
@ 2023-03-26 16:49           ` Peter Xu
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Xu @ 2023-03-26 16:49 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: peter.maydell, qemu-devel, quintela

On Wed, Mar 22, 2023 at 03:16:23PM -0400, Peter Xu wrote:
> On Wed, Mar 22, 2023 at 06:13:43PM +0000, Dr. David Alan Gilbert wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > On Wed, Mar 22, 2023 at 02:05:06PM +0000, Dr. David Alan Gilbert wrote:
> > > > * Peter Xu (peterx@redhat.com) wrote:
> > > > > On Tue, Mar 21, 2023 at 08:24:37PM +0000, Dr. David Alan Gilbert wrote:
> > > > > > Hi Peter's,
> > > > > >   Peter M pointed me to a seg in a migration test in CI; I can reproduce
> > > > > > it:
> > > > > >   * On an s390 host
> > > > > 
> > > > > How easy to reproduce?
> > > > 
> > > > Pretty much every time when run as:
> > > > make check -j 4
> > > > 
> > > > > >   * only as part of a make check - running migration-test by itself
> > > > > > doesn't trigger for me.
> > > > > >   * It looks like it's postcopy preempt

snip

> > > > > > Looking at the iov and file it's garbage; so it makes me think this is
> > > > > > something like a flush on a closed file.
> > > > > 
> > > > > I didn't figure out how that could be closed, but I think there's
> > > > > indeed a possible race where the qemufile can be accessed by both the
> > > > > return path thread and the migration thread concurrently, while the
> > > > > qemufile is not thread safe.
> > > > > 
> > > > > What postcopy_preempt_shutdown_file() does is: the src uses this EOS to
> > > > > kick the dest QEMU preempt thread out of the migration and shut it off.
> > > > > After some thought I think this is unnecessary complexity, since postcopy
> > > > > should end at the point where the dest has received all the data; it then
> > > > > sends a SHUT to the src.  So it's not good to have the dest relying on
> > > > > anything from the src to shut anything down (the preempt thread here),
> > > > > because it's the dest qemu that makes the final decision to finish.
> > > > > Ideally the preempt thread on the dest should be able to shut itself down.
> > > > > 
> > > > > The trick here is that the preempt thread will be blocked in read() (aka,
> > > > > recvmsg()) on the channel at that time, and the only way to kick it out of
> > > > > that is a shutdown() on the dest.  I attached a patch that does it.  I'm
> > > > > not 100% sure whether it'll resolve our problem, but it's worth trying.
> > > > > This also made me notice we forgot to enable the SHUTDOWN feature on the
> > > > > tls server when I was running patch 1 with qtest, so two patches are
> > > > > needed.
> > > > 
> > > > Well, it seems to fix it:
> > > > 
> > > > Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > 
> > > > Will this break interaction with an old qemu that's still waiting for
> > > > the EOS?
> > > 
> > > Right... when someone migrates from an 8.0+ QEMU (assuming this can land
> > > in this release) to a 7.2 QEMU with postcopy+preempt enabled.
> > 
> > So why not carry on sending the EOS?
> 
> Because it's still racy to send it, so such a migration is potentially
> buggy and can crash?  Or maybe you meant we could serialize the sends
> (e.g. with some locks) so we keep the send but avoid the crash
> (hopefully!).  Then the lock would only be used for old qemu binaries.
> 
> > 
> > > I still see full preempt support only in 8.0; we did a major rework in 8.0
> > > to shift to the return path.  So maybe it's worth the risk?  I can also add
> > > a flag for this, but OTOH it may add maintenance burden in general.
> > 
> > If you're saying pre-8.0 never worked then I guess that's fine;
> 
> I think pre-8.0 works too.  The rework shouldn't have changed that fact or
> the stream protocol.
> 
> > it's tricky for, say, libvirt or a higher level tool to know what it can
> > use.
> 
> Libvirt doesn't support preempt mode yet, but yeah, it'll be a problem in
> the future too.
> 
> So maybe I should introduce a flag and send this EOS conditionally,
> ignoring the possibility of a crash?

Here's a follow-up on this one.

I think we need to have this resolved sooner or later; if we want it for
8.0 then we'll need it fast, because at least the current solution will
break compatibility.

We can also postpone the fix, so there's no rush for 8.0.  But
unfortunately, when I was testing this patch (with a new internal flag for
pre-7.2 as an addition) I found that qemu is already broken for preempt
migrations from 8.0 to a 7.2 binary, because of the recent rework of the
channel ordering... that rework changed when the src/dst establish the
preempt channel, to avoid race conditions, but it also means a migration
from 8.0 -> 7.2 will hang, because 7.2 will try to wait for the preempt
channel while 8.0 will not establish it until switching to postcopy. :(

So, here's my take on a whole solution to the issues: we not only fix up
this rare hang with the additional flag, but also use that additional flag
to fix the breakage above.  Luckily we're still in the 8.0 window, so we
still have a chance to fix that.  Let's see whether we can still make it.
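
(One possible shape of that flag, purely illustrative; the field name and
where it gets set are assumptions on my part, not the final patch.)

/* Sketch only: gate the final EOS on a compatibility flag so that new
 * binaries rely on the dest-side shutdown() instead. */
static void postcopy_preempt_finish(MigrationState *s)
{
    if (s->preempt_pre_7_2) {   /* hypothetical compat-flag field */
        /* Old dest binaries block waiting for this EOS on the channel */
        postcopy_preempt_shutdown_file(s);
    }
    /* Otherwise the dest preempt thread shuts itself down (patch 2) */
}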

In any case, I'll prepare patches very soon today and mark them as 8.0
material.  I'll also test 8.0 <-> 7.2 migrations with preempt enabled, then
we'll move the discussion there.

Thanks,

-- 
Peter Xu




Thread overview: 8+ messages
2023-03-21 20:24 s390 migration crash Dr. David Alan Gilbert
2023-03-22  0:19 ` Peter Xu
2023-03-22 14:05   ` Dr. David Alan Gilbert
2023-03-22 15:02     ` Peter Xu
2023-03-22 18:13       ` Dr. David Alan Gilbert
2023-03-22 19:16         ` Peter Xu
2023-03-26 16:49           ` Peter Xu
2023-03-22 19:21   ` Daniel P. Berrangé
