[PULL 00/17] Migration patches for 2024-12-17

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PULL 00/17] Migration patches for 2024-12-17
@ 2024-12-17 17:48 Fabiano Rosas
  2024-12-17 17:48 ` [PULL 01/17] migration/multifd: Fix compile error caused by page_size usage Fabiano Rosas
                   ` (17 more replies)
  0 siblings, 18 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

The following changes since commit 8032c78e556cd0baec111740a6c636863f9bd7c8:

  Merge tag 'firmware-20241216-pull-request' of https://gitlab.com/kraxel/qemu into staging (2024-12-16 14:20:33 -0500)

are available in the Git repository at:

  https://gitlab.com/farosas/qemu.git tags/migration-20241217-pull-request

for you to fetch changes up to 1bed6df0c71d3a74286f53c01fafd21fde8777f4:

  tests/qtest/migration: Fix compile errors when CONFIG_UADK is set (2024-12-17 13:51:19 -0300)

----------------------------------------------------------------
Migration pull request

- Shameer's fixes for CONFIG_UADK code

- Peter's multifd sync cleanups, prereq. for VFIO and postcopy work

- Fabiano's fix for multifd regression in pre-9.0 -> post-9.1
  migrations (#2720)

- Fabiano's fix for s390x migration regression (#2704)

- Peter's fix for assertions during paused migrations; reworks
  late-block-activate logic (#2395, #686)

----------------------------------------------------------------

Fabiano Rosas (2):
  migration/multifd: Fix compat with QEMU < 9.0
  s390x: Fix CSS migration

Peter Xu (13):
  migration/multifd: Further remove the SYNC on complete
  migration/multifd: Allow to sync with sender threads only
  migration/ram: Move RAM_SAVE_FLAG* into ram.h
  migration/multifd: Unify RAM_SAVE_FLAG_MULTIFD_FLUSH messages
  migration/multifd: Remove sync processing on postcopy
  migration/multifd: Cleanup src flushes on condition check
  migration/multifd: Document the reason to sync for save_setup()
  migration: Add helper to get target runstate
  qmp/cont: Only activate disks if migration completed
  migration/block: Make late-block-active the default
  migration/block: Apply late-block-active behavior to postcopy
  migration/block: Fix possible race with block_inactive
  migration/block: Rewrite disk activation

Shameer Kolothum (2):
  migration/multifd: Fix compile error caused by page_size usage
  tests/qtest/migration: Fix compile errors when CONFIG_UADK is set

 hw/s390x/s390-virtio-ccw.c                |   2 +-
 include/migration/misc.h                  |   4 +
 migration/block-active.c                  |  94 +++++++++++++++
 migration/colo.c                          |   2 +-
 migration/meson.build                     |   1 +
 migration/migration.c                     | 136 +++++++++-------------
 migration/migration.h                     |   6 +-
 migration/multifd-nocomp.c                |  74 +++++++++++-
 migration/multifd-uadk.c                  |   2 +-
 migration/multifd.c                       |  32 +++--
 migration/multifd.h                       |  27 ++++-
 migration/ram.c                           |  89 +++++++-------
 migration/ram.h                           |  28 +++++
 migration/rdma.h                          |   7 --
 migration/savevm.c                        |  46 ++++----
 migration/trace-events                    |   3 +
 monitor/qmp-cmds.c                        |  22 ++--
 tests/qtest/migration/compression-tests.c |  54 ---------
 18 files changed, 372 insertions(+), 257 deletions(-)
 create mode 100644 migration/block-active.c

-- 
2.35.3



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PULL 01/17] migration/multifd: Fix compile error caused by page_size usage
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 02/17] migration/multifd: Further remove the SYNC on complete Fabiano Rosas
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Shameer Kolothum

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

>From Commit 90fa121c6c07 ("migration/multifd: Inline page_size and
page_count") onwards page_size is not part of MutiFD*Params but uses
an inline constant instead.

However, it missed updating an old usage, causing a compile error.

Fixes: 90fa121c6c07 ("migration/multifd: Inline page_size and page_count")
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241203124943.52572-1-shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd-uadk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/multifd-uadk.c b/migration/multifd-uadk.c
index 6e6a290ae9..6895c1f65a 100644
--- a/migration/multifd-uadk.c
+++ b/migration/multifd-uadk.c
@@ -169,7 +169,7 @@ static int multifd_uadk_send_prepare(MultiFDSendParams *p, Error **errp)
             .src_len = page_size,
             .dst     = buf,
             /* Set dst_len to double the src in case compressed out >= page_size */
-            .dst_len = p->page_size * 2,
+            .dst_len = page_size * 2,
         };
 
         if (uadk_data->handle) {
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 02/17] migration/multifd: Further remove the SYNC on complete
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
  2024-12-17 17:48 ` [PULL 01/17] migration/multifd: Fix compile error caused by page_size usage Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 03/17] migration/multifd: Allow to sync with sender threads only Fabiano Rosas
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

Commit 637280aeb2 ("migration/multifd: Avoid the final FLUSH in
complete()") stopped sending the RAM_SAVE_FLAG_MULTIFD_FLUSH flag at
ram_save_complete(), because the sync on the destination side is not
needed due to the last iteration of find_dirty_block() having already
done it.

However, that commit overlooked that multifd_ram_flush_and_sync() on the
source side is also not needed at ram_save_complete(), for the same
reason.

Moreover, removing the RAM_SAVE_FLAG_MULTIFD_FLUSH but keeping the
multifd_ram_flush_and_sync() means that currently the recv threads will
hang when receiving the MULTIFD_FLAG_SYNC message, waiting for the
destination sync which only happens when RAM_SAVE_FLAG_MULTIFD_FLUSH is
received.

Luckily, multifd is still all working fine because recv side cleanup
code (mostly multifd_recv_sync_main()) is smart enough to make sure even
if recv threads are stuck at SYNC it'll get kicked out. And since this
is the completion phase of migration, nothing else will be sent after
the SYNCs.

This needs to be fixed because in the future VFIO will have data to push
after ram_save_complete() and we don't want the recv thread to be stuck
in the MULTIFD_FLAG_SYNC message.

Remove the unnecessary (and buggy) invocation of
multifd_ram_flush_and_sync().

For very old binaries (multifd_flush_after_each_section==true), the
flush_and_sync is still needed because each EOS received on destination
will enforce all-channel sync once.

Stable branches do not need this patch, as no real bug I can think of
that will go wrong there.. so not attaching Fixes to be clear on the
backport not needed.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206224755.1108686-2-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/ram.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 05ff9eb328..7284c34bd8 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3283,9 +3283,16 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         }
     }

-    ret = multifd_ram_flush_and_sync();
-    if (ret < 0) {
-        return ret;
+    if (migrate_multifd() &&
+        migrate_multifd_flush_after_each_section()) {
+        /*
+         * Only the old dest QEMU will need this sync, because each EOS
+         * will require one SYNC message on each channel.
+         */
+        ret = multifd_ram_flush_and_sync();
+        if (ret < 0) {
+            return ret;
+        }
     }

     if (migrate_mapped_ram()) {
-- 
2.35.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 03/17] migration/multifd: Allow to sync with sender threads only
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
  2024-12-17 17:48 ` [PULL 01/17] migration/multifd: Fix compile error caused by page_size usage Fabiano Rosas
  2024-12-17 17:48 ` [PULL 02/17] migration/multifd: Further remove the SYNC on complete Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 04/17] migration/ram: Move RAM_SAVE_FLAG* into ram.h Fabiano Rosas
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

Teach multifd_send_sync_main() to sync with threads only.

We already have such requests, which is when mapped-ram is enabled with
multifd.  In that case, no SYNC messages will be pushed to the stream when
multifd syncs the sender threads because there's no destination threads
waiting for that.  The whole point of the sync is to make sure all threads
finished their jobs.

So fundamentally we have a request to do the sync in different ways:

  - Either to sync the threads only,
  - Or to sync the threads but also with the destination side.

Mapped-ram did it already because of the use_packet check in the sync
handler of the sender thread.  It works.

However it may stop working when e.g. VFIO may start to reuse multifd
channels to push device states.  In that case VFIO has similar request on
"thread-only sync" however we can't check a flag because such sync request
can still come from RAM which needs the on-wire notifications.

Paving way for that by allowing the multifd_send_sync_main() to specify
what kind of sync the caller needs.  We can use it for mapped-ram already.

No functional change intended.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206224755.1108686-3-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd-nocomp.c |  7 ++++++-
 migration/multifd.c        | 17 +++++++++++------
 migration/multifd.h        | 23 ++++++++++++++++++++---
 3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index 55191152f9..219f9e58ef 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -345,6 +345,8 @@ retry:
 
 int multifd_ram_flush_and_sync(void)
 {
+    MultiFDSyncReq req;
+
     if (!migrate_multifd()) {
         return 0;
     }
@@ -356,7 +358,10 @@ int multifd_ram_flush_and_sync(void)
         }
     }
 
-    return multifd_send_sync_main();
+    /* File migrations only need to sync with threads */
+    req = migrate_mapped_ram() ? MULTIFD_SYNC_LOCAL : MULTIFD_SYNC_ALL;
+
+    return multifd_send_sync_main(req);
 }
 
 bool multifd_send_prepare_common(MultiFDSendParams *p)
diff --git a/migration/multifd.c b/migration/multifd.c
index 498e71fd10..7ecc3964ee 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -523,11 +523,13 @@ static int multifd_zero_copy_flush(QIOChannel *c)
     return ret;
 }
 
-int multifd_send_sync_main(void)
+int multifd_send_sync_main(MultiFDSyncReq req)
 {
     int i;
     bool flush_zero_copy;
 
+    assert(req != MULTIFD_SYNC_NONE);
+
     flush_zero_copy = migrate_zero_copy_send();
 
     for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -543,8 +545,8 @@ int multifd_send_sync_main(void)
          * We should be the only user so far, so not possible to be set by
          * others concurrently.
          */
-        assert(qatomic_read(&p->pending_sync) == false);
-        qatomic_set(&p->pending_sync, true);
+        assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE);
+        qatomic_set(&p->pending_sync, req);
         qemu_sem_post(&p->sem);
     }
     for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -635,14 +637,17 @@ static void *multifd_send_thread(void *opaque)
              */
             qatomic_store_release(&p->pending_job, false);
         } else {
+            MultiFDSyncReq req = qatomic_read(&p->pending_sync);
+
             /*
              * If not a normal job, must be a sync request.  Note that
              * pending_sync is a standalone flag (unlike pending_job), so
              * it doesn't require explicit memory barriers.
              */
-            assert(qatomic_read(&p->pending_sync));
+            assert(req != MULTIFD_SYNC_NONE);
 
-            if (use_packets) {
+            /* Only push the SYNC message if it involves a remote sync */
+            if (req == MULTIFD_SYNC_ALL) {
                 p->flags = MULTIFD_FLAG_SYNC;
                 multifd_send_fill_packet(p);
                 ret = qio_channel_write_all(p->c, (void *)p->packet,
@@ -654,7 +659,7 @@ static void *multifd_send_thread(void *opaque)
                 stat64_add(&mig_stats.multifd_bytes, p->packet_len);
             }
 
-            qatomic_set(&p->pending_sync, false);
+            qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE);
             qemu_sem_post(&p->sem_sync);
         }
     }
diff --git a/migration/multifd.h b/migration/multifd.h
index 50d58c0c9c..6493512305 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -19,6 +19,22 @@
 typedef struct MultiFDRecvData MultiFDRecvData;
 typedef struct MultiFDSendData MultiFDSendData;
 
+typedef enum {
+    /* No sync request */
+    MULTIFD_SYNC_NONE = 0,
+    /* Sync locally on the sender threads without pushing messages */
+    MULTIFD_SYNC_LOCAL,
+    /*
+     * Sync not only on the sender threads, but also push MULTIFD_FLAG_SYNC
+     * message to the wire for each iochannel (which is for a remote sync).
+     *
+     * When remote sync is used, need to be paired with a follow up
+     * RAM_SAVE_FLAG_EOS / RAM_SAVE_FLAG_MULTIFD_FLUSH message on the main
+     * channel.
+     */
+    MULTIFD_SYNC_ALL,
+} MultiFDSyncReq;
+
 bool multifd_send_setup(void);
 void multifd_send_shutdown(void);
 void multifd_send_channel_created(void);
@@ -28,7 +44,7 @@ void multifd_recv_shutdown(void);
 bool multifd_recv_all_channels_created(void);
 void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
 void multifd_recv_sync_main(void);
-int multifd_send_sync_main(void);
+int multifd_send_sync_main(MultiFDSyncReq req);
 bool multifd_queue_page(RAMBlock *block, ram_addr_t offset);
 bool multifd_recv(void);
 MultiFDRecvData *multifd_get_recv_data(void);
@@ -143,7 +159,7 @@ typedef struct {
     /* multifd flags for each packet */
     uint32_t flags;
     /*
-     * The sender thread has work to do if either of below boolean is set.
+     * The sender thread has work to do if either of below field is set.
      *
      * @pending_job:  a job is pending
      * @pending_sync: a sync request is pending
@@ -152,7 +168,8 @@ typedef struct {
      * cleared by the multifd sender threads.
      */
     bool pending_job;
-    bool pending_sync;
+    MultiFDSyncReq pending_sync;
+
     MultiFDSendData *data;
 
     /* thread local variables. No locking required */
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 04/17] migration/ram: Move RAM_SAVE_FLAG* into ram.h
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (2 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 03/17] migration/multifd: Allow to sync with sender threads only Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 05/17] migration/multifd: Unify RAM_SAVE_FLAG_MULTIFD_FLUSH messages Fabiano Rosas
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

Firstly, we're going to use the multifd flag soon in multifd code, so ram.c
isn't gonna work.

Secondly, we have a separate RDMA flag dangling around, which is definitely
not obvious.  There's one comment that helps, but not too much.

Put all RAM save flags altogether, so nothing will get overlooked.

Add a section explain why we can't use bits over 0x200.

Remove RAM_SAVE_FLAG_FULL as it's already not used in QEMU, as the comment
explained.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206224755.1108686-4-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/ram.c  | 21 ---------------------
 migration/ram.h  | 28 ++++++++++++++++++++++++++++
 migration/rdma.h |  7 -------
 3 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 7284c34bd8..44010ff325 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -71,27 +71,6 @@
 /***********************************************************/
 /* ram save/restore */
 
-/*
- * RAM_SAVE_FLAG_ZERO used to be named RAM_SAVE_FLAG_COMPRESS, it
- * worked for pages that were filled with the same char.  We switched
- * it to only search for the zero value.  And to avoid confusion with
- * RAM_SAVE_FLAG_COMPRESS_PAGE just rename it.
- *
- * RAM_SAVE_FLAG_FULL was obsoleted in 2009.
- *
- * RAM_SAVE_FLAG_COMPRESS_PAGE (0x100) was removed in QEMU 9.1.
- */
-#define RAM_SAVE_FLAG_FULL     0x01
-#define RAM_SAVE_FLAG_ZERO     0x02
-#define RAM_SAVE_FLAG_MEM_SIZE 0x04
-#define RAM_SAVE_FLAG_PAGE     0x08
-#define RAM_SAVE_FLAG_EOS      0x10
-#define RAM_SAVE_FLAG_CONTINUE 0x20
-#define RAM_SAVE_FLAG_XBZRLE   0x40
-/* 0x80 is reserved in rdma.h for RAM_SAVE_FLAG_HOOK */
-#define RAM_SAVE_FLAG_MULTIFD_FLUSH    0x200
-/* We can't use any flag that is bigger than 0x200 */
-
 /*
  * mapped-ram migration supports O_DIRECT, so we need to make sure the
  * userspace buffer, the IO operation size and the file offset are
diff --git a/migration/ram.h b/migration/ram.h
index 0d1981f888..921c39a2c5 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -33,6 +33,34 @@
 #include "exec/cpu-common.h"
 #include "io/channel.h"
 
+/*
+ * RAM_SAVE_FLAG_ZERO used to be named RAM_SAVE_FLAG_COMPRESS, it
+ * worked for pages that were filled with the same char.  We switched
+ * it to only search for the zero value.  And to avoid confusion with
+ * RAM_SAVE_FLAG_COMPRESS_PAGE just rename it.
+ *
+ * RAM_SAVE_FLAG_FULL (0x01) was obsoleted in 2009.
+ *
+ * RAM_SAVE_FLAG_COMPRESS_PAGE (0x100) was removed in QEMU 9.1.
+ *
+ * RAM_SAVE_FLAG_HOOK is only used in RDMA. Whenever this is found in the
+ * data stream, the flags will be passed to rdma functions in the
+ * incoming-migration side.
+ *
+ * We can't use any flag that is bigger than 0x200, because the flags are
+ * always assumed to be encoded in a ramblock address offset, which is
+ * multiple of PAGE_SIZE.  Here it means QEMU supports migration with any
+ * architecture that has PAGE_SIZE>=1K (0x400).
+ */
+#define RAM_SAVE_FLAG_ZERO                    0x002
+#define RAM_SAVE_FLAG_MEM_SIZE                0x004
+#define RAM_SAVE_FLAG_PAGE                    0x008
+#define RAM_SAVE_FLAG_EOS                     0x010
+#define RAM_SAVE_FLAG_CONTINUE                0x020
+#define RAM_SAVE_FLAG_XBZRLE                  0x040
+#define RAM_SAVE_FLAG_HOOK                    0x080
+#define RAM_SAVE_FLAG_MULTIFD_FLUSH           0x200
+
 extern XBZRLECacheStats xbzrle_counters;
 
 /* Should be holding either ram_list.mutex, or the RCU lock. */
diff --git a/migration/rdma.h b/migration/rdma.h
index a8d27f33b8..f55f28bbed 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -33,13 +33,6 @@ void rdma_start_incoming_migration(InetSocketAddress *host_port, Error **errp);
 #define RAM_CONTROL_ROUND     1
 #define RAM_CONTROL_FINISH    3
 
-/*
- * Whenever this is found in the data stream, the flags
- * will be passed to rdma functions in the incoming-migration
- * side.
- */
-#define RAM_SAVE_FLAG_HOOK     0x80
-
 #define RAM_SAVE_CONTROL_NOT_SUPP -1000
 #define RAM_SAVE_CONTROL_DELAYED  -2000
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 05/17] migration/multifd: Unify RAM_SAVE_FLAG_MULTIFD_FLUSH messages
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (3 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 04/17] migration/ram: Move RAM_SAVE_FLAG* into ram.h Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 06/17] migration/multifd: Remove sync processing on postcopy Fabiano Rosas
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

RAM_SAVE_FLAG_MULTIFD_FLUSH message should always be correlated to a sync
request on src.  Unify such message into one place, and conditionally send
the message only if necessary.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206224755.1108686-5-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd-nocomp.c | 27 +++++++++++++++++++++++++--
 migration/multifd.h        |  2 +-
 migration/ram.c            | 18 ++++--------------
 3 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index 219f9e58ef..58372db0f4 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -20,6 +20,7 @@
 #include "qemu/cutils.h"
 #include "qemu/error-report.h"
 #include "trace.h"
+#include "qemu-file.h"
 
 static MultiFDSendData *multifd_ram_send;
 
@@ -343,9 +344,10 @@ retry:
     return true;
 }
 
-int multifd_ram_flush_and_sync(void)
+int multifd_ram_flush_and_sync(QEMUFile *f)
 {
     MultiFDSyncReq req;
+    int ret;
 
     if (!migrate_multifd()) {
         return 0;
@@ -361,7 +363,28 @@ int multifd_ram_flush_and_sync(void)
     /* File migrations only need to sync with threads */
     req = migrate_mapped_ram() ? MULTIFD_SYNC_LOCAL : MULTIFD_SYNC_ALL;
 
-    return multifd_send_sync_main(req);
+    ret = multifd_send_sync_main(req);
+    if (ret) {
+        return ret;
+    }
+
+    /* If we don't need to sync with remote at all, nothing else to do */
+    if (req == MULTIFD_SYNC_LOCAL) {
+        return 0;
+    }
+
+    /*
+     * Old QEMUs don't understand RAM_SAVE_FLAG_MULTIFD_FLUSH, it relies
+     * on RAM_SAVE_FLAG_EOS instead.
+     */
+    if (migrate_multifd_flush_after_each_section()) {
+        return 0;
+    }
+
+    qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
+    qemu_fflush(f);
+
+    return 0;
 }
 
 bool multifd_send_prepare_common(MultiFDSendParams *p)
diff --git a/migration/multifd.h b/migration/multifd.h
index 6493512305..0fef431f6b 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -354,7 +354,7 @@ static inline uint32_t multifd_ram_page_count(void)
 
 void multifd_ram_save_setup(void);
 void multifd_ram_save_cleanup(void);
-int multifd_ram_flush_and_sync(void);
+int multifd_ram_flush_and_sync(QEMUFile *f);
 size_t multifd_ram_payload_size(void);
 void multifd_ram_fill_packet(MultiFDSendParams *p);
 int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp);
diff --git a/migration/ram.c b/migration/ram.c
index 44010ff325..90811aabd4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1306,15 +1306,10 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
                 (!migrate_multifd_flush_after_each_section() ||
                  migrate_mapped_ram())) {
                 QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
-                int ret = multifd_ram_flush_and_sync();
+                int ret = multifd_ram_flush_and_sync(f);
                 if (ret < 0) {
                     return ret;
                 }
-
-                if (!migrate_mapped_ram()) {
-                    qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
-                    qemu_fflush(f);
-                }
             }
 
             /* Hit the end of the list */
@@ -3044,18 +3039,13 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
     }
 
     bql_unlock();
-    ret = multifd_ram_flush_and_sync();
+    ret = multifd_ram_flush_and_sync(f);
     bql_lock();
     if (ret < 0) {
         error_setg(errp, "%s: multifd synchronization failed", __func__);
         return ret;
     }
 
-    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()
-        && !migrate_mapped_ram()) {
-        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
-    }
-
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
     ret = qemu_fflush(f);
     if (ret < 0) {
@@ -3190,7 +3180,7 @@ out:
     if (ret >= 0 && migration_is_running()) {
         if (migrate_multifd() && migrate_multifd_flush_after_each_section() &&
             !migrate_mapped_ram()) {
-            ret = multifd_ram_flush_and_sync();
+            ret = multifd_ram_flush_and_sync(f);
             if (ret < 0) {
                 return ret;
             }
@@ -3268,7 +3258,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
          * Only the old dest QEMU will need this sync, because each EOS
          * will require one SYNC message on each channel.
          */
-        ret = multifd_ram_flush_and_sync();
+        ret = multifd_ram_flush_and_sync(f);
         if (ret < 0) {
             return ret;
         }
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 06/17] migration/multifd: Remove sync processing on postcopy
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (4 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 05/17] migration/multifd: Unify RAM_SAVE_FLAG_MULTIFD_FLUSH messages Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 07/17] migration/multifd: Cleanup src flushes on condition check Fabiano Rosas
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

Multifd never worked with postcopy, at least yet so far.

Remove the sync processing there, because it's confusing, and they should
never appear.  Now if RAM_SAVE_FLAG_MULTIFD_FLUSH is observed, we fail hard
instead of trying to invoke multifd code.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206224755.1108686-6-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/ram.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 90811aabd4..154ff5abd4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3772,15 +3772,7 @@ int ram_load_postcopy(QEMUFile *f, int channel)
                                          TARGET_PAGE_SIZE);
             }
             break;
-        case RAM_SAVE_FLAG_MULTIFD_FLUSH:
-            multifd_recv_sync_main();
-            break;
         case RAM_SAVE_FLAG_EOS:
-            /* normal exit */
-            if (migrate_multifd() &&
-                migrate_multifd_flush_after_each_section()) {
-                multifd_recv_sync_main();
-            }
             break;
         default:
             error_report("Unknown combination of migration flags: 0x%x"
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 07/17] migration/multifd: Cleanup src flushes on condition check
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (5 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 06/17] migration/multifd: Remove sync processing on postcopy Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 08/17] migration/multifd: Document the reason to sync for save_setup() Fabiano Rosas
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

The src flush condition check is over complicated, and it's getting more
out of control if postcopy will be involved.

In general, we have two modes to do the sync: legacy or modern ways.
Legacy uses per-section flush, modern uses per-round flush.

Mapped-ram always uses the modern, which is per-round.

Introduce two helpers, which can greatly simplify the code, and hopefully
make it readable again.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206224755.1108686-7-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd-nocomp.c | 42 ++++++++++++++++++++++++++++++++++++++
 migration/multifd.h        |  2 ++
 migration/ram.c            | 10 +++------
 3 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index 58372db0f4..c1f686c0ce 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -344,6 +344,48 @@ retry:
     return true;
 }
 
+/*
+ * We have two modes for multifd flushes:
+ *
+ * - Per-section mode: this is the legacy way to flush, it requires one
+ *   MULTIFD_FLAG_SYNC message for each RAM_SAVE_FLAG_EOS.
+ *
+ * - Per-round mode: this is the modern way to flush, it requires one
+ *   MULTIFD_FLAG_SYNC message only for each round of RAM scan.  Normally
+ *   it's paired with a new RAM_SAVE_FLAG_MULTIFD_FLUSH message in network
+ *   based migrations.
+ *
+ * One thing to mention is mapped-ram always use the modern way to sync.
+ */
+
+/* Do we need a per-section multifd flush (legacy way)? */
+bool multifd_ram_sync_per_section(void)
+{
+    if (!migrate_multifd()) {
+        return false;
+    }
+
+    if (migrate_mapped_ram()) {
+        return false;
+    }
+
+    return migrate_multifd_flush_after_each_section();
+}
+
+/* Do we need a per-round multifd flush (modern way)? */
+bool multifd_ram_sync_per_round(void)
+{
+    if (!migrate_multifd()) {
+        return false;
+    }
+
+    if (migrate_mapped_ram()) {
+        return true;
+    }
+
+    return !migrate_multifd_flush_after_each_section();
+}
+
 int multifd_ram_flush_and_sync(QEMUFile *f)
 {
     MultiFDSyncReq req;
diff --git a/migration/multifd.h b/migration/multifd.h
index 0fef431f6b..bd785b9873 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -355,6 +355,8 @@ static inline uint32_t multifd_ram_page_count(void)
 void multifd_ram_save_setup(void);
 void multifd_ram_save_cleanup(void);
 int multifd_ram_flush_and_sync(QEMUFile *f);
+bool multifd_ram_sync_per_round(void);
+bool multifd_ram_sync_per_section(void);
 size_t multifd_ram_payload_size(void);
 void multifd_ram_fill_packet(MultiFDSendParams *p);
 int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp);
diff --git a/migration/ram.c b/migration/ram.c
index 154ff5abd4..5d4bdefe69 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1302,9 +1302,7 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
         pss->page = 0;
         pss->block = QLIST_NEXT_RCU(pss->block, next);
         if (!pss->block) {
-            if (migrate_multifd() &&
-                (!migrate_multifd_flush_after_each_section() ||
-                 migrate_mapped_ram())) {
+            if (multifd_ram_sync_per_round()) {
                 QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
                 int ret = multifd_ram_flush_and_sync(f);
                 if (ret < 0) {
@@ -3178,8 +3176,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 
 out:
     if (ret >= 0 && migration_is_running()) {
-        if (migrate_multifd() && migrate_multifd_flush_after_each_section() &&
-            !migrate_mapped_ram()) {
+        if (multifd_ram_sync_per_section()) {
             ret = multifd_ram_flush_and_sync(f);
             if (ret < 0) {
                 return ret;
@@ -3252,8 +3249,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         }
     }
 
-    if (migrate_multifd() &&
-        migrate_multifd_flush_after_each_section()) {
+    if (multifd_ram_sync_per_section()) {
         /*
          * Only the old dest QEMU will need this sync, because each EOS
          * will require one SYNC message on each channel.
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 08/17] migration/multifd: Document the reason to sync for save_setup()
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (6 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 07/17] migration/multifd: Cleanup src flushes on condition check Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 09/17] migration/multifd: Fix compat with QEMU < 9.0 Fabiano Rosas
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

It's not straightforward to see why src QEMU needs to sync multifd during
setup() phase.  After all, there's no page queued at that point.

For old QEMUs, there's a solid reason: EOS requires it to work.  While it's
clueless on the new QEMUs which do not take EOS message as sync requests.

One will figure that out only when this is conditionally removed.  In fact,
the author did try it out.  Logically we could still avoid doing this on
new machine types, however that needs a separate compat field and that can
be an overkill in some trivial overhead in setup() phase.

Let's instead document it completely, to avoid someone else tries this
again and do the debug one more time, or anyone confused on why this ever
existed.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206224755.1108686-8-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/ram.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 5d4bdefe69..e5c590b259 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3036,6 +3036,31 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
         migration_ops->ram_save_target_page = ram_save_target_page_legacy;
     }
 
+    /*
+     * This operation is unfortunate..
+     *
+     * For legacy QEMUs using per-section sync
+     * =======================================
+     *
+     * This must exist because the EOS below requires the SYNC messages
+     * per-channel to work.
+     *
+     * For modern QEMUs using per-round sync
+     * =====================================
+     *
+     * Logically such sync is not needed, and recv threads should not run
+     * until setup ready (using things like channels_ready on src).  Then
+     * we should be all fine.
+     *
+     * However even if we add channels_ready to recv side in new QEMUs, old
+     * QEMU won't have them so this sync will still be needed to make sure
+     * multifd recv threads won't start processing guest pages early before
+     * ram_load_setup() is properly done.
+     *
+     * Let's stick with this.  Fortunately the overhead is low to sync
+     * during setup because the VM is running, so at least it's not
+     * accounted as part of downtime.
+     */
     bql_unlock();
     ret = multifd_ram_flush_and_sync(f);
     bql_lock();
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 09/17] migration/multifd: Fix compat with QEMU < 9.0
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (7 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 08/17] migration/multifd: Document the reason to sync for save_setup() Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 10/17] s390x: Fix CSS migration Fabiano Rosas
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, qemu-stable

Commit f5f48a7891 ("migration/multifd: Separate SYNC request with
normal jobs") changed the multifd source side to stop sending data
along with the MULTIFD_FLAG_SYNC, effectively introducing the concept
of a SYNC-only packet. Relying on that, commit d7e58f412c
("migration/multifd: Don't send ram data during SYNC") later came
along and skipped reading data from SYNC packets.

In a versions timeline like this:

  8.2 f5f48a7 9.0 9.1 d7e58f41 9.2

The issue arises that QEMUs < 9.0 still send data along with SYNC, but
QEMUs > 9.1 don't gather that data anymore. This leads to various
kinds of migration failures due to desync/missing data.

Stop checking for a SYNC packet on the destination and unconditionally
unfill the packet.

>From now on:

old -> new:
the source sends data + sync, destination reads normally

new -> new:
source sends only sync, destination reads zeros

new -> old:
source sends only sync, destination reads zeros

CC: qemu-stable@nongnu.org
Fixes: d7e58f412c ("migration/multifd: Don't send ram data during SYNC")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2720
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241213160120.23880-2-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 7ecc3964ee..e98af0f116 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -252,9 +252,8 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
     p->packet_num = be64_to_cpu(packet->packet_num);
     p->packets_recved++;
 
-    if (!(p->flags & MULTIFD_FLAG_SYNC)) {
-        ret = multifd_ram_unfill_packet(p, errp);
-    }
+    /* Always unfill, old QEMUs (<9.0) send data along with SYNC */
+    ret = multifd_ram_unfill_packet(p, errp);
 
     trace_multifd_recv_unfill(p->id, p->packet_num, p->flags,
                               p->next_packet_size);
@@ -1156,9 +1155,13 @@ static void *multifd_recv_thread(void *opaque)
             flags = p->flags;
             /* recv methods don't know how to handle the SYNC flag */
             p->flags &= ~MULTIFD_FLAG_SYNC;
-            if (!(flags & MULTIFD_FLAG_SYNC)) {
-                has_data = p->normal_num || p->zero_num;
-            }
+
+            /*
+             * Even if it's a SYNC packet, this needs to be set
+             * because older QEMUs (<9.0) still send data along with
+             * the SYNC packet.
+             */
+            has_data = p->normal_num || p->zero_num;
             qemu_mutex_unlock(&p->mutex);
         } else {
             /*
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 10/17] s390x: Fix CSS migration
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (8 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 09/17] migration/multifd: Fix compat with QEMU < 9.0 Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 11/17] migration: Add helper to get target runstate Fabiano Rosas
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Paolo Bonzini, qemu-stable, Thomas Huth

Commit a55ae46683 ("s390: move css_migration_enabled from machine to
css.c") disabled CSS migration globally instead of doing it
per-instance.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: qemu-stable@nongnu.org #9.1
Fixes: a55ae46683 ("s390: move css_migration_enabled from machine to css.c")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2704
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241213160120.23880-3-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 hw/s390x/s390-virtio-ccw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 67ae34aead..4db73fc6f2 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -1201,6 +1201,7 @@ static void ccw_machine_2_9_instance_options(MachineState *machine)
     s390_cpudef_featoff_greater(12, 1, S390_FEAT_ZPCI);
     s390_cpudef_featoff_greater(12, 1, S390_FEAT_ADAPTER_INT_SUPPRESSION);
     s390_cpudef_featoff_greater(12, 1, S390_FEAT_ADAPTER_EVENT_NOTIFICATION);
+    css_migration_enabled = false;
 }
 
 static void ccw_machine_2_9_class_options(MachineClass *mc)
@@ -1213,7 +1214,6 @@ static void ccw_machine_2_9_class_options(MachineClass *mc)
     ccw_machine_2_10_class_options(mc);
     compat_props_add(mc->compat_props, hw_compat_2_9, hw_compat_2_9_len);
     compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
-    css_migration_enabled = false;
 }
 DEFINE_CCW_MACHINE(2, 9);
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 11/17] migration: Add helper to get target runstate
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (9 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 10/17] s390x: Fix CSS migration Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 12/17] qmp/cont: Only activate disks if migration completed Fabiano Rosas
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

In 99% cases, after QEMU migrates to dest host, it tries to detect the
target VM runstate using global_state_get_runstate().

There's one outlier so far which is Xen that won't send global state.
That's the major reason why global_state_received() check was always there
together with global_state_get_runstate().

However it's utterly confusing why global_state_received() has anything to
do with "let's start VM or not".

Provide a helper to explain it, then we have an unified entry for getting
the target dest QEMU runstate after migration.

Suggested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206230838.1111496-2-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 8c5bd0a75c..d2a6b939cf 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -135,6 +135,21 @@ static bool migration_needs_multiple_sockets(void)
     return migrate_multifd() || migrate_postcopy_preempt();
 }
 
+static RunState migration_get_target_runstate(void)
+{
+    /*
+     * When the global state is not migrated, it means we don't know the
+     * runstate of the src QEMU.  We don't have much choice but assuming
+     * the VM is running.  NOTE: this is pretty rare case, so far only Xen
+     * uses it.
+     */
+    if (!global_state_received()) {
+        return RUN_STATE_RUNNING;
+    }
+
+    return global_state_get_runstate();
+}
+
 static bool transport_supports_multi_channels(MigrationAddress *addr)
 {
     if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
@@ -735,8 +750,7 @@ static void process_incoming_migration_bh(void *opaque)
      * unless we really are starting the VM.
      */
     if (!migrate_late_block_activate() ||
-         (autostart && (!global_state_received() ||
-            runstate_is_live(global_state_get_runstate())))) {
+        (autostart && runstate_is_live(migration_get_target_runstate()))) {
         /* Make sure all file formats throw away their mutable metadata.
          * If we get an error here, just don't restart the VM yet. */
         bdrv_activate_all(&local_err);
@@ -759,8 +773,7 @@ static void process_incoming_migration_bh(void *opaque)
 
     dirty_bitmap_mig_before_vm_start();
 
-    if (!global_state_received() ||
-        runstate_is_live(global_state_get_runstate())) {
+    if (runstate_is_live(migration_get_target_runstate())) {
         if (autostart) {
             vm_start();
         } else {
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 12/17] qmp/cont: Only activate disks if migration completed
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (10 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 11/17] migration: Add helper to get target runstate Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 13/17] migration/block: Make late-block-active the default Fabiano Rosas
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Kevin Wolf, Markus Armbruster

From: Peter Xu <peterx@redhat.com>

As the comment says, the activation of disks is for the case where
migration has completed, rather than when QEMU is still during
migration (RUN_STATE_INMIGRATE).

Move the code over to reflect what the comment is describing.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-3-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 monitor/qmp-cmds.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index f84a0dc523..76f21e8af3 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -96,21 +96,23 @@ void qmp_cont(Error **errp)
         }
     }
 
-    /* Continuing after completed migration. Images have been inactivated to
-     * allow the destination to take control. Need to get control back now.
-     *
-     * If there are no inactive block nodes (e.g. because the VM was just
-     * paused rather than completing a migration), bdrv_inactivate_all() simply
-     * doesn't do anything. */
-    bdrv_activate_all(&local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        return;
-    }
-
     if (runstate_check(RUN_STATE_INMIGRATE)) {
         autostart = 1;
     } else {
+        /*
+         * Continuing after completed migration. Images have been
+         * inactivated to allow the destination to take control. Need to
+         * get control back now.
+         *
+         * If there are no inactive block nodes (e.g. because the VM was
+         * just paused rather than completing a migration),
+         * bdrv_inactivate_all() simply doesn't do anything.
+         */
+        bdrv_activate_all(&local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
         vm_start();
     }
 }
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 13/17] migration/block: Make late-block-active the default
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (11 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 12/17] qmp/cont: Only activate disks if migration completed Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 14/17] migration/block: Apply late-block-active behavior to postcopy Fabiano Rosas
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

Migration capability 'late-block-active' controls when the block drives
will be activated.  If enabled, block drives will only be activated until
VM starts, either src runstate was "live" (RUNNING, or SUSPENDED), or it'll
be postponed until qmp_cont().

Let's do this unconditionally.  There's no harm to delay activation of
block drives.  Meanwhile there's no ABI breakage if dest does it, because
src QEMU has nothing to do with it, so it's no concern on ABI breakage.

IIUC we could avoid introducing this cap when introducing it before, but
now it's still not too late to just always do it.  Cap now prone to
removal, but it'll be for later patches.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-4-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index d2a6b939cf..e6db9cfc50 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -743,24 +743,6 @@ static void process_incoming_migration_bh(void *opaque)
 
     trace_vmstate_downtime_checkpoint("dst-precopy-bh-enter");
 
-    /* If capability late_block_activate is set:
-     * Only fire up the block code now if we're going to restart the
-     * VM, else 'cont' will do it.
-     * This causes file locking to happen; so we don't want it to happen
-     * unless we really are starting the VM.
-     */
-    if (!migrate_late_block_activate() ||
-        (autostart && runstate_is_live(migration_get_target_runstate()))) {
-        /* Make sure all file formats throw away their mutable metadata.
-         * If we get an error here, just don't restart the VM yet. */
-        bdrv_activate_all(&local_err);
-        if (local_err) {
-            error_report_err(local_err);
-            local_err = NULL;
-            autostart = false;
-        }
-    }
-
     /*
      * This must happen after all error conditions are dealt with and
      * we're sure the VM is going to be running on this host.
@@ -775,7 +757,25 @@ static void process_incoming_migration_bh(void *opaque)
 
     if (runstate_is_live(migration_get_target_runstate())) {
         if (autostart) {
-            vm_start();
+            /*
+             * Block activation is always delayed until VM starts, either
+             * here (which means we need to start the dest VM right now..),
+             * or until qmp_cont() later.
+             *
+             * We used to have cap 'late-block-activate' but now we do this
+             * unconditionally, as it has no harm but only benefit.  E.g.,
+             * it's not part of migration ABI on the time of disk activation.
+             *
+             * Make sure all file formats throw away their mutable
+             * metadata.  If error, don't restart the VM yet.
+             */
+            bdrv_activate_all(&local_err);
+            if (local_err) {
+                error_report_err(local_err);
+                local_err = NULL;
+            } else {
+                vm_start();
+            }
         } else {
             runstate_set(RUN_STATE_PAUSED);
         }
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 14/17] migration/block: Apply late-block-active behavior to postcopy
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (12 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 13/17] migration/block: Make late-block-active the default Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 15/17] migration/block: Fix possible race with block_inactive Fabiano Rosas
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

Postcopy never cared about late-block-active.  However there's no mention
in the capability that it doesn't apply to postcopy.

Considering that we _assumed_ late activation is always good, do that too
for postcopy unconditionally, just like precopy.  After this patch, we
should have unified the behavior across all.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-5-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/savevm.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 98821c8120..80726726fe 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2137,22 +2137,21 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
 
     trace_vmstate_downtime_checkpoint("dst-postcopy-bh-announced");
 
-    /* Make sure all file formats throw away their mutable metadata.
-     * If we get an error here, just don't restart the VM yet. */
-    bdrv_activate_all(&local_err);
-    if (local_err) {
-        error_report_err(local_err);
-        local_err = NULL;
-        autostart = false;
-    }
-
-    trace_vmstate_downtime_checkpoint("dst-postcopy-bh-cache-invalidated");
-
     dirty_bitmap_mig_before_vm_start();
 
     if (autostart) {
-        /* Hold onto your hats, starting the CPU */
-        vm_start();
+        /*
+         * Make sure all file formats throw away their mutable metadata.
+         * If we get an error here, just don't restart the VM yet.
+         */
+        bdrv_activate_all(&local_err);
+        trace_vmstate_downtime_checkpoint("dst-postcopy-bh-cache-invalidated");
+        if (local_err) {
+            error_report_err(local_err);
+            local_err = NULL;
+        } else {
+            vm_start();
+        }
     } else {
         /* leave it paused and let management decide when to start the CPU */
         runstate_set(RUN_STATE_PAUSED);
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 15/17] migration/block: Fix possible race with block_inactive
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (13 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 14/17] migration/block: Apply late-block-active behavior to postcopy Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 16/17] migration/block: Rewrite disk activation Fabiano Rosas
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

Src QEMU sets block_inactive=true very early before the invalidation takes
place.  It means if something wrong happened during setting the flag but
before reaching qemu_savevm_state_complete_precopy_non_iterable() where it
did the invalidation work, it'll make block_inactive flag inconsistent.

For example, think about when qemu_savevm_state_complete_precopy_iterable()
can fail: it will have block_inactive set to true even if all block drives
are active.

Fix that by only update the flag after the invalidation is done.

No Fixes for any commit, because it's not an issue if bdrv_activate_all()
is re-entrant upon all-active disks - false positive block_inactive can
bring nothing more than "trying to active the blocks but they're already
active".  However let's still do it right to avoid the inconsistent flag
v.s. reality.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-6-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 9 +++------
 migration/savevm.c    | 2 ++
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index e6db9cfc50..ba5deec5bc 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2749,14 +2749,11 @@ static int migration_completion_precopy(MigrationState *s,
         goto out_unlock;
     }
 
-    /*
-     * Inactivate disks except in COLO, and track that we have done so in order
-     * to remember to reactivate them if migration fails or is cancelled.
-     */
-    s->block_inactive = !migrate_colo();
     migration_rate_set(RATE_LIMIT_DISABLED);
+
+    /* Inactivate disks except in COLO */
     ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
-                                             s->block_inactive);
+                                             !migrate_colo());
 out_unlock:
     bql_unlock();
     return ret;
diff --git a/migration/savevm.c b/migration/savevm.c
index 80726726fe..706b77ffab 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1558,6 +1558,8 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
             qemu_file_set_error(f, ret);
             return ret;
         }
+        /* Remember that we did this */
+        s->block_inactive = true;
     }
     if (!in_postcopy) {
         /* Postcopy stream will still be going */
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 16/17] migration/block: Rewrite disk activation
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (14 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 15/17] migration/block: Fix possible race with block_inactive Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-17 17:48 ` [PULL 17/17] tests/qtest/migration: Fix compile errors when CONFIG_UADK is set Fabiano Rosas
  2024-12-19 12:32 ` [PULL 00/17] Migration patches for 2024-12-17 Stefan Hajnoczi
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

From: Peter Xu <peterx@redhat.com>

This patch proposes a flag to maintain disk activation status globally.  It
mostly rewrites disk activation mgmt for QEMU, including COLO and QMP
command xen_save_devices_state.

Backgrounds
===========

We have two problems on disk activations, one resolved, one not.

Problem 1: disk activation recover (for switchover interruptions)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When migration is either cancelled or failed during switchover, especially
when after the disks are inactivated, QEMU needs to remember re-activate
the disks again before vm starts.

It used to be done separately in two paths: one in qmp_migrate_cancel(),
the other one in the failure path of migration_completion().

It used to be fixed in different commits, all over the places in QEMU.  So
these are the relevant changes I saw, I'm not sure if it's complete list:

 - In 2016, commit fe904ea824 ("migration: regain control of images when
   migration fails to complete")

 - In 2017, commit 1d2acc3162 ("migration: re-active images while migration
   been canceled after inactive them")

 - In 2023, commit 6dab4c93ec ("migration: Attempt disk reactivation in
   more failure scenarios")

Now since we have a slightly better picture maybe we can unify the
reactivation in a single path.

One side benefit of doing so is, we can move the disk operation outside QMP
command "migrate_cancel".  It's possible that in the future we may want to
make "migrate_cancel" be OOB-compatible, while that requires the command
doesn't need BQL in the first place.  This will already do that and make
migrate_cancel command lightweight.

Problem 2: disk invalidation on top of invalidated disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is an unresolved bug for current QEMU.  Link in "Resolves:" at the
end.  It turns out besides the src switchover phase (problem 1 above), QEMU
also needs to remember block activation on destination.

Consider two continuous migration in a row, where the VM was always paused.
In that scenario, the disks are not activated even until migration
completed in the 1st round.  When the 2nd round starts, if QEMU doesn't
know the status of the disks, it needs to try inactivate the disk again.

Here the issue is the block layer API bdrv_inactivate_all() will crash a
QEMU if invoked on already inactive disks for the 2nd migration.  For
detail, see the bug link at the end.

Implementation
==============

This patch proposes to maintain disk activation with a global flag, so we
know:

  - If we used to inactivate disks for migration, but migration got
  cancelled, or failed, QEMU will know it should reactivate the disks.

  - On incoming side, if the disks are never activated but then another
  migration is triggered, QEMU should be able to tell that inactivate is
  not needed for the 2nd migration.

We used to have disk_inactive, but it only solves the 1st issue, not the
2nd.  Also, it's done in completely separate paths so it's extremely hard
to follow either how the flag changes, or the duration that the flag is
valid, and when we will reactivate the disks.

Convert the existing disk_inactive flag into that global flag (also invert
its naming), and maintain the disk activation status for the whole
lifecycle of qemu.  That includes the incoming QEMU.

Put both of the error cases of source migration (failure, cancelled)
together into migration_iteration_finish(), which will be invoked for
either of the scenario.  So from that part QEMU should behave the same as
before.  However with such global maintenance on disk activation status, we
not only cleanup quite a few temporary paths that we try to maintain the
disk activation status (e.g. in postcopy code), meanwhile it fixes the
crash for problem 2 in one shot.

For freshly started QEMU, the flag is initialized to TRUE showing that the
QEMU owns the disks by default.

For incoming migrated QEMU, the flag will be initialized to FALSE once and
for all showing that the dest QEMU doesn't own the disks until switchover.
That is guaranteed by the "once" variable.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2395
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-7-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 include/migration/misc.h |  4 ++
 migration/block-active.c | 94 ++++++++++++++++++++++++++++++++++++++++
 migration/colo.c         |  2 +-
 migration/meson.build    |  1 +
 migration/migration.c    | 80 ++++++++--------------------------
 migration/migration.h    |  6 +--
 migration/savevm.c       | 33 ++++++--------
 migration/trace-events   |  3 ++
 monitor/qmp-cmds.c       |  8 +---
 9 files changed, 140 insertions(+), 91 deletions(-)
 create mode 100644 migration/block-active.c

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 804eb23c06..e68a473feb 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -106,4 +106,8 @@ bool migration_incoming_postcopy_advised(void);
 /* True if background snapshot is active */
 bool migration_in_bg_snapshot(void);
 
+/* Wrapper for block active/inactive operations */
+bool migration_block_activate(Error **errp);
+bool migration_block_inactivate(void);
+
 #endif
diff --git a/migration/block-active.c b/migration/block-active.c
new file mode 100644
index 0000000000..d477cf8182
--- /dev/null
+++ b/migration/block-active.c
@@ -0,0 +1,94 @@
+/*
+ * Block activation tracking for migration purpose
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * Copyright (C) 2024 Red Hat, Inc.
+ */
+#include "qemu/osdep.h"
+#include "block/block.h"
+#include "qapi/error.h"
+#include "migration/migration.h"
+#include "qemu/error-report.h"
+#include "trace.h"
+
+/*
+ * Migration-only cache to remember the block layer activation status.
+ * Protected by BQL.
+ *
+ * We need this because..
+ *
+ * - Migration can fail after block devices are invalidated (during
+ *   switchover phase).  When that happens, we need to be able to recover
+ *   the block drive status by re-activating them.
+ *
+ * - Currently bdrv_inactivate_all() is not safe to be invoked on top of
+ *   invalidated drives (even if bdrv_activate_all() is actually safe to be
+ *   called any time!).  It means remembering this could help migration to
+ *   make sure it won't invalidate twice in a row, crashing QEMU.  It can
+ *   happen when we migrate a PAUSED VM from host1 to host2, then migrate
+ *   again to host3 without starting it.  TODO: a cleaner solution is to
+ *   allow safe invoke of bdrv_inactivate_all() at anytime, like
+ *   bdrv_activate_all().
+ *
+ * For freshly started QEMU, the flag is initialized to TRUE reflecting the
+ * scenario where QEMU owns block device ownerships.
+ *
+ * For incoming QEMU taking a migration stream, the flag is initialized to
+ * FALSE reflecting that the incoming side doesn't own the block devices,
+ * not until switchover happens.
+ */
+static bool migration_block_active;
+
+/* Setup the disk activation status */
+void migration_block_active_setup(bool active)
+{
+    migration_block_active = active;
+}
+
+bool migration_block_activate(Error **errp)
+{
+    ERRP_GUARD();
+
+    assert(bql_locked());
+
+    if (migration_block_active) {
+        trace_migration_block_activation("active-skipped");
+        return true;
+    }
+
+    trace_migration_block_activation("active");
+
+    bdrv_activate_all(errp);
+    if (*errp) {
+        error_report_err(error_copy(*errp));
+        return false;
+    }
+
+    migration_block_active = true;
+    return true;
+}
+
+bool migration_block_inactivate(void)
+{
+    int ret;
+
+    assert(bql_locked());
+
+    if (!migration_block_active) {
+        trace_migration_block_activation("inactive-skipped");
+        return true;
+    }
+
+    trace_migration_block_activation("inactive");
+
+    ret = bdrv_inactivate_all();
+    if (ret) {
+        error_report("%s: bdrv_inactivate_all() failed: %d",
+                     __func__, ret);
+        return false;
+    }
+
+    migration_block_active = false;
+    return true;
+}
diff --git a/migration/colo.c b/migration/colo.c
index 9590f281d0..ae3387a7a4 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -836,7 +836,7 @@ static void *colo_process_incoming_thread(void *opaque)
 
     /* Make sure all file formats throw away their mutable metadata */
     bql_lock();
-    bdrv_activate_all(&local_err);
+    migration_block_activate(&local_err);
     bql_unlock();
     if (local_err) {
         error_report_err(local_err);
diff --git a/migration/meson.build b/migration/meson.build
index d53cf3417a..dac687ee3a 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -11,6 +11,7 @@ migration_files = files(
 
 system_ss.add(files(
   'block-dirty-bitmap.c',
+  'block-active.c',
   'channel.c',
   'channel-block.c',
   'cpu-throttle.c',
diff --git a/migration/migration.c b/migration/migration.c
index ba5deec5bc..d755ccb03d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -738,7 +738,6 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
 
 static void process_incoming_migration_bh(void *opaque)
 {
-    Error *local_err = NULL;
     MigrationIncomingState *mis = opaque;
 
     trace_vmstate_downtime_checkpoint("dst-precopy-bh-enter");
@@ -769,11 +768,7 @@ static void process_incoming_migration_bh(void *opaque)
              * Make sure all file formats throw away their mutable
              * metadata.  If error, don't restart the VM yet.
              */
-            bdrv_activate_all(&local_err);
-            if (local_err) {
-                error_report_err(local_err);
-                local_err = NULL;
-            } else {
+            if (migration_block_activate(NULL)) {
                 vm_start();
             }
         } else {
@@ -1552,16 +1547,6 @@ static void migrate_fd_cancel(MigrationState *s)
             }
         }
     }
-    if (s->state == MIGRATION_STATUS_CANCELLING && s->block_inactive) {
-        Error *local_err = NULL;
-
-        bdrv_activate_all(&local_err);
-        if (local_err) {
-            error_report_err(local_err);
-        } else {
-            s->block_inactive = false;
-        }
-    }
 }
 
 void migration_add_notifier_mode(NotifierWithReturn *notify,
@@ -1860,6 +1845,12 @@ void qmp_migrate_incoming(const char *uri, bool has_channels,
         return;
     }
 
+    /*
+     * Newly setup incoming QEMU.  Mark the block active state to reflect
+     * that the src currently owns the disks.
+     */
+    migration_block_active_setup(false);
+
     once = false;
 }
 
@@ -2512,7 +2503,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
     QIOChannelBuffer *bioc;
     QEMUFile *fb;
     uint64_t bandwidth = migrate_max_postcopy_bandwidth();
-    bool restart_block = false;
     int cur_state = MIGRATION_STATUS_ACTIVE;
 
     if (migrate_postcopy_preempt()) {
@@ -2548,13 +2538,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
         goto fail;
     }
 
-    ret = bdrv_inactivate_all();
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "%s: Failed in bdrv_inactivate_all()",
-                         __func__);
+    if (!migration_block_inactivate()) {
+        error_setg(errp, "%s: Failed in bdrv_inactivate_all()", __func__);
         goto fail;
     }
-    restart_block = true;
 
     /*
      * Cause any non-postcopiable, but iterative devices to
@@ -2624,8 +2611,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
         goto fail_closefb;
     }
 
-    restart_block = false;
-
     /* Now send that blob */
     if (qemu_savevm_send_packaged(ms->to_dst_file, bioc->data, bioc->usage)) {
         error_setg(errp, "%s: Failed to send packaged data", __func__);
@@ -2670,17 +2655,7 @@ fail_closefb:
 fail:
     migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
                           MIGRATION_STATUS_FAILED);
-    if (restart_block) {
-        /* A failure happened early enough that we know the destination hasn't
-         * accessed block devices, so we're safe to recover.
-         */
-        Error *local_err = NULL;
-
-        bdrv_activate_all(&local_err);
-        if (local_err) {
-            error_report_err(local_err);
-        }
-    }
+    migration_block_activate(NULL);
     migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
     bql_unlock();
     return -1;
@@ -2778,31 +2753,6 @@ static void migration_completion_postcopy(MigrationState *s)
     trace_migration_completion_postcopy_end_after_complete();
 }
 
-static void migration_completion_failed(MigrationState *s,
-                                        int current_active_state)
-{
-    if (s->block_inactive && (s->state == MIGRATION_STATUS_ACTIVE ||
-                              s->state == MIGRATION_STATUS_DEVICE)) {
-        /*
-         * If not doing postcopy, vm_start() will be called: let's
-         * regain control on images.
-         */
-        Error *local_err = NULL;
-
-        bql_lock();
-        bdrv_activate_all(&local_err);
-        if (local_err) {
-            error_report_err(local_err);
-        } else {
-            s->block_inactive = false;
-        }
-        bql_unlock();
-    }
-
-    migrate_set_state(&s->state, current_active_state,
-                      MIGRATION_STATUS_FAILED);
-}
-
 /**
  * migration_completion: Used by migration_thread when there's not much left.
  *   The caller 'breaks' the loop when this returns.
@@ -2856,7 +2806,8 @@ fail:
         error_free(local_err);
     }
 
-    migration_completion_failed(s, current_active_state);
+    migrate_set_state(&s->state, current_active_state,
+                      MIGRATION_STATUS_FAILED);
 }
 
 /**
@@ -3286,6 +3237,11 @@ static void migration_iteration_finish(MigrationState *s)
     case MIGRATION_STATUS_FAILED:
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_CANCELLING:
+        /*
+         * Re-activate the block drives if they're inactivated.  Note, COLO
+         * shouldn't use block_active at all, so it should be no-op there.
+         */
+        migration_block_activate(NULL);
         if (runstate_is_live(s->vm_old_state)) {
             if (!runstate_check(RUN_STATE_SHUTDOWN)) {
                 vm_start();
@@ -3858,6 +3814,8 @@ static void migration_instance_init(Object *obj)
     ms->state = MIGRATION_STATUS_NONE;
     ms->mbps = -1;
     ms->pages_per_second = -1;
+    /* Freshly started QEMU owns all the block devices */
+    migration_block_active_setup(true);
     qemu_sem_init(&ms->pause_sem, 0);
     qemu_mutex_init(&ms->error_mutex);
 
diff --git a/migration/migration.h b/migration/migration.h
index 3857905c0e..fab3cad2b9 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -370,9 +370,6 @@ struct MigrationState {
     /* Flag set once the migration thread is running (and needs joining) */
     bool migration_thread_running;
 
-    /* Flag set once the migration thread called bdrv_inactivate_all */
-    bool block_inactive;
-
     /* Migration is waiting for guest to unplug device */
     QemuSemaphore wait_unplug_sem;
 
@@ -556,4 +553,7 @@ void migration_bitmap_sync_precopy(bool last_stage);
 /* migration/block-dirty-bitmap.c */
 void dirty_bitmap_mig_init(void);
 
+/* migration/block-active.c */
+void migration_block_active_setup(bool active);
+
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index 706b77ffab..969a994a85 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1547,19 +1547,18 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
     }
 
     if (inactivate_disks) {
-        /* Inactivate before sending QEMU_VM_EOF so that the
-         * bdrv_activate_all() on the other end won't fail. */
-        ret = bdrv_inactivate_all();
-        if (ret) {
-            error_setg(&local_err, "%s: bdrv_inactivate_all() failed (%d)",
-                       __func__, ret);
+        /*
+         * Inactivate before sending QEMU_VM_EOF so that the
+         * bdrv_activate_all() on the other end won't fail.
+         */
+        if (!migration_block_inactivate()) {
+            error_setg(&local_err, "%s: bdrv_inactivate_all() failed",
+                       __func__);
             migrate_set_error(ms, local_err);
             error_report_err(local_err);
-            qemu_file_set_error(f, ret);
+            qemu_file_set_error(f, -EFAULT);
             return ret;
         }
-        /* Remember that we did this */
-        s->block_inactive = true;
     }
     if (!in_postcopy) {
         /* Postcopy stream will still be going */
@@ -2123,7 +2122,6 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
 
 static void loadvm_postcopy_handle_run_bh(void *opaque)
 {
-    Error *local_err = NULL;
     MigrationIncomingState *mis = opaque;
 
     trace_vmstate_downtime_checkpoint("dst-postcopy-bh-enter");
@@ -2146,12 +2144,11 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
          * Make sure all file formats throw away their mutable metadata.
          * If we get an error here, just don't restart the VM yet.
          */
-        bdrv_activate_all(&local_err);
+        bool success = migration_block_activate(NULL);
+
         trace_vmstate_downtime_checkpoint("dst-postcopy-bh-cache-invalidated");
-        if (local_err) {
-            error_report_err(local_err);
-            local_err = NULL;
-        } else {
+
+        if (success) {
             vm_start();
         }
     } else {
@@ -3193,11 +3190,7 @@ void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
          * side of the migration take control of the images.
          */
         if (live && !saved_vm_running) {
-            ret = bdrv_inactivate_all();
-            if (ret) {
-                error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
-                           __func__, ret);
-            }
+            migration_block_inactivate();
         }
     }
 
diff --git a/migration/trace-events b/migration/trace-events
index bb0e0cc6dc..b82a1c5e40 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -383,3 +383,6 @@ migration_pagecache_insert(void) "Error allocating page"
 # cpu-throttle.c
 cpu_throttle_set(int new_throttle_pct)  "set guest CPU throttled by %d%%"
 cpu_throttle_dirty_sync(void) ""
+
+# block-active.c
+migration_block_activation(const char *name) "%s"
diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index 76f21e8af3..6f76d9beaf 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -31,6 +31,7 @@
 #include "qapi/type-helpers.h"
 #include "hw/mem/memory-device.h"
 #include "hw/intc/intc.h"
+#include "migration/misc.h"
 
 NameInfo *qmp_query_name(Error **errp)
 {
@@ -103,13 +104,8 @@ void qmp_cont(Error **errp)
          * Continuing after completed migration. Images have been
          * inactivated to allow the destination to take control. Need to
          * get control back now.
-         *
-         * If there are no inactive block nodes (e.g. because the VM was
-         * just paused rather than completing a migration),
-         * bdrv_inactivate_all() simply doesn't do anything.
          */
-        bdrv_activate_all(&local_err);
-        if (local_err) {
+        if (!migration_block_activate(&local_err)) {
             error_propagate(errp, local_err);
             return;
         }
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PULL 17/17] tests/qtest/migration: Fix compile errors when CONFIG_UADK is set
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (15 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 16/17] migration/block: Rewrite disk activation Fabiano Rosas
@ 2024-12-17 17:48 ` Fabiano Rosas
  2024-12-19 12:32 ` [PULL 00/17] Migration patches for 2024-12-17 Stefan Hajnoczi
  17 siblings, 0 replies; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-17 17:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Shameer Kolothum

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Removes accidental inclusion of unrelated functions within CONFIG_UADK
as this causes compile errors like:

error: redefinition of ‘migrate_hook_start_xbzrle’

Fixes: 932f74f3fe6e ("tests/qtest/migration: Split compression tests from migration-test.c")
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241217131046.83844-1-shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 tests/qtest/migration/compression-tests.c | 54 -----------------------
 1 file changed, 54 deletions(-)

diff --git a/tests/qtest/migration/compression-tests.c b/tests/qtest/migration/compression-tests.c
index 6de87bc47d..d78f1f11f1 100644
--- a/tests/qtest/migration/compression-tests.c
+++ b/tests/qtest/migration/compression-tests.c
@@ -88,59 +88,6 @@ migrate_hook_start_precopy_tcp_multifd_uadk(QTestState *from,
     return migrate_hook_start_precopy_tcp_multifd_common(from, to, "uadk");
 }
 
-static void *
-migrate_hook_start_xbzrle(QTestState *from,
-                          QTestState *to)
-{
-    migrate_set_parameter_int(from, "xbzrle-cache-size", 33554432);
-
-    migrate_set_capability(from, "xbzrle", true);
-    migrate_set_capability(to, "xbzrle", true);
-
-    return NULL;
-}
-
-static void test_precopy_unix_xbzrle(void)
-{
-    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
-    MigrateCommon args = {
-        .connect_uri = uri,
-        .listen_uri = uri,
-        .start_hook = migrate_hook_start_xbzrle,
-        .iterations = 2,
-        /*
-         * XBZRLE needs pages to be modified when doing the 2nd+ round
-         * iteration to have real data pushed to the stream.
-         */
-        .live = true,
-    };
-
-    test_precopy_common(&args);
-}
-
-static void *
-migrate_hook_start_precopy_tcp_multifd_zlib(QTestState *from,
-                                            QTestState *to)
-{
-    /*
-     * Overloading this test to also check that set_parameter does not error.
-     * This is also done in the tests for the other compression methods.
-     */
-    migrate_set_parameter_int(from, "multifd-zlib-level", 2);
-    migrate_set_parameter_int(to, "multifd-zlib-level", 2);
-
-    return migrate_hook_start_precopy_tcp_multifd_common(from, to, "zlib");
-}
-
-static void test_multifd_tcp_zlib(void)
-{
-    MigrateCommon args = {
-        .listen_uri = "defer",
-        .start_hook = migrate_hook_start_precopy_tcp_multifd_zlib,
-    };
-    test_precopy_common(&args);
-}
-
 static void test_multifd_tcp_uadk(void)
 {
     MigrateCommon args = {
@@ -151,7 +98,6 @@ static void test_multifd_tcp_uadk(void)
 }
 #endif /* CONFIG_UADK */
 
-
 static void *
 migrate_hook_start_xbzrle(QTestState *from,
                           QTestState *to)
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
                   ` (16 preceding siblings ...)
  2024-12-17 17:48 ` [PULL 17/17] tests/qtest/migration: Fix compile errors when CONFIG_UADK is set Fabiano Rosas
@ 2024-12-19 12:32 ` Stefan Hajnoczi
  2024-12-19 18:53   ` Fabiano Rosas
  17 siblings, 1 reply; 29+ messages in thread
From: Stefan Hajnoczi @ 2024-12-19 12:32 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Peter Xu

[-- Attachment #1: Type: text/plain, Size: 2190 bytes --]

Hi Fabiano,
Please take a look at this CI failure:

>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
stderr:
Traceback (most recent call last):
  File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
    dump.read(dump_memory = args.memory)
  File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
    section.read()
  File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
    field['data'] = reader(field, self.file)
  File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
    for field in self.desc['struct']['fields']:
KeyError: 'fields'
warning: fd: migration to a file is deprecated. Use file: instead.
warning: fd: migration to a file is deprecated. Use file: instead.
**
ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
(test program exited with status code -6)

https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190

If you find this pull request caused the failure, please send a new
revision. Otherwise please let me know so we can continue to
investigate.

Thanks,
Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2024-12-19 12:32 ` [PULL 00/17] Migration patches for 2024-12-17 Stefan Hajnoczi
@ 2024-12-19 18:53   ` Fabiano Rosas
  2024-12-20 16:28     ` Peter Xu
  0 siblings, 1 reply; 29+ messages in thread
From: Fabiano Rosas @ 2024-12-19 18:53 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Peter Xu, Thomas Huth

Stefan Hajnoczi <stefanha@redhat.com> writes:

> Hi Fabiano,
> Please take a look at this CI failure:
>
>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
> stderr:
> Traceback (most recent call last):
>   File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
>     dump.read(dump_memory = args.memory)
>   File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
>     section.read()
>   File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
>     field['data'] = reader(field, self.file)
>   File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
>     for field in self.desc['struct']['fields']:
> KeyError: 'fields'

This is the command line that runs only this specific test:

PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
./tests/qtest/migration-test -p /s390x/migration/analyze-script

I cannot reproduce in migration-next nor in the detached HEAD that the
pipeline ran in (had to download the tarball from gitlab).

The only s390 patch in this PR is one that I can test just fine with
TCG, so there shouldn't be any difference from KVM (i.e. there should be
no state being migrated with KVM that is not already migrated with TCG).

> warning: fd: migration to a file is deprecated. Use file: instead.
> warning: fd: migration to a file is deprecated. Use file: instead.

This is harmless.

> **
> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
> (test program exited with status code -6)

This is the assert at the end of the tests, irrelevant.

>
> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
>
> If you find this pull request caused the failure, please send a new
> revision. Otherwise please let me know so we can continue to
> investigate.

I don't have an s390x host at hand so the only thing I can to is to drop
that patch and hope that resolves the problem. @Peter, @Thomas, any
other ideas? Can you verify this on your end?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2024-12-19 18:53   ` Fabiano Rosas
@ 2024-12-20 16:28     ` Peter Xu
  2025-01-02  9:32       ` Thomas Huth
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Xu @ 2024-12-20 16:28 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: Stefan Hajnoczi, qemu-devel, Thomas Huth

On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > Hi Fabiano,
> > Please take a look at this CI failure:
> >
> >>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> > ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
> > stderr:
> > Traceback (most recent call last):
> >   File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
> >     dump.read(dump_memory = args.memory)
> >   File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
> >     section.read()
> >   File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
> >     field['data'] = reader(field, self.file)
> >   File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
> >     for field in self.desc['struct']['fields']:
> > KeyError: 'fields'
> 
> This is the command line that runs only this specific test:
> 
> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
> 
> I cannot reproduce in migration-next nor in the detached HEAD that the
> pipeline ran in (had to download the tarball from gitlab).
> 
> The only s390 patch in this PR is one that I can test just fine with
> TCG, so there shouldn't be any difference from KVM (i.e. there should be
> no state being migrated with KVM that is not already migrated with TCG).
> 
> > warning: fd: migration to a file is deprecated. Use file: instead.
> > warning: fd: migration to a file is deprecated. Use file: instead.
> 
> This is harmless.
> 
> > **
> > ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
> > (test program exited with status code -6)
> 
> This is the assert at the end of the tests, irrelevant.
> 
> >
> > https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
> >
> > If you find this pull request caused the failure, please send a new
> > revision. Otherwise please let me know so we can continue to
> > investigate.
> 
> I don't have an s390x host at hand so the only thing I can to is to drop
> that patch and hope that resolves the problem. @Peter, @Thomas, any
> other ideas? Can you verify this on your end?

Cannot reproduce either here, x86_64 host only.  The report was from s390
host, though.  I'm not familiar with the s390 patch, I wonder if any of you
could use plain brain power to figure more things out.

We could wait for 1-2 more days to see whether Thomas can figure it out,
hopefully easily reproduceable on s390.. or we can also leave that for
later.  And if the current issue on such fix is s390-host-only, might be
easier to be picked up by s390 tree, perhaps?

In all cases, the regression is since 9.1, so maybe we don't need to rush
anything in.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2024-12-20 16:28     ` Peter Xu
@ 2025-01-02  9:32       ` Thomas Huth
  2025-01-03 18:30         ` Fabiano Rosas
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Huth @ 2025-01-02  9:32 UTC (permalink / raw)
  To: Peter Xu, Fabiano Rosas; +Cc: Stefan Hajnoczi, qemu-devel

On 20/12/2024 17.28, Peter Xu wrote:
> On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>>
>>> Hi Fabiano,
>>> Please take a look at this CI failure:
>>>
>>>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>>> ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
>>> stderr:
>>> Traceback (most recent call last):
>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
>>>      dump.read(dump_memory = args.memory)
>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
>>>      section.read()
>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
>>>      field['data'] = reader(field, self.file)
>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
>>>      for field in self.desc['struct']['fields']:
>>> KeyError: 'fields'
>>
>> This is the command line that runs only this specific test:
>>
>> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
>> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
>>
>> I cannot reproduce in migration-next nor in the detached HEAD that the
>> pipeline ran in (had to download the tarball from gitlab).
>>
>> The only s390 patch in this PR is one that I can test just fine with
>> TCG, so there shouldn't be any difference from KVM (i.e. there should be
>> no state being migrated with KVM that is not already migrated with TCG).
>>
>>> warning: fd: migration to a file is deprecated. Use file: instead.
>>> warning: fd: migration to a file is deprecated. Use file: instead.
>>
>> This is harmless.
>>
>>> **
>>> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
>>> (test program exited with status code -6)
>>
>> This is the assert at the end of the tests, irrelevant.
>>
>>>
>>> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
>>>
>>> If you find this pull request caused the failure, please send a new
>>> revision. Otherwise please let me know so we can continue to
>>> investigate.
>>
>> I don't have an s390x host at hand so the only thing I can to is to drop
>> that patch and hope that resolves the problem. @Peter, @Thomas, any
>> other ideas? Can you verify this on your end?
> 
> Cannot reproduce either here, x86_64 host only.  The report was from s390
> host, though.  I'm not familiar with the s390 patch, I wonder if any of you
> could use plain brain power to figure more things out.
> 
> We could wait for 1-2 more days to see whether Thomas can figure it out,
> hopefully easily reproduceable on s390.. or we can also leave that for
> later.  And if the current issue on such fix is s390-host-only, might be
> easier to be picked up by s390 tree, perhaps?

I tested migration-20241217-pull-request on a s390x (RHEL) host, but I 
cannot reproduce the issue there - make check-qtest works without any 
problems. Is it maybe related to that specific Ubuntu installation?

  Thomas



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2025-01-02  9:32       ` Thomas Huth
@ 2025-01-03 18:30         ` Fabiano Rosas
  2025-01-03 20:31           ` Stefan Hajnoczi
  0 siblings, 1 reply; 29+ messages in thread
From: Fabiano Rosas @ 2025-01-03 18:30 UTC (permalink / raw)
  To: Thomas Huth, Peter Xu; +Cc: Stefan Hajnoczi, qemu-devel

Thomas Huth <thuth@redhat.com> writes:

> On 20/12/2024 17.28, Peter Xu wrote:
>> On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
>>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>>>
>>>> Hi Fabiano,
>>>> Please take a look at this CI failure:
>>>>
>>>>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>>>> ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
>>>> stderr:
>>>> Traceback (most recent call last):
>>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
>>>>      dump.read(dump_memory = args.memory)
>>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
>>>>      section.read()
>>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
>>>>      field['data'] = reader(field, self.file)
>>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
>>>>      for field in self.desc['struct']['fields']:
>>>> KeyError: 'fields'
>>>
>>> This is the command line that runs only this specific test:
>>>
>>> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
>>> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
>>>
>>> I cannot reproduce in migration-next nor in the detached HEAD that the
>>> pipeline ran in (had to download the tarball from gitlab).
>>>
>>> The only s390 patch in this PR is one that I can test just fine with
>>> TCG, so there shouldn't be any difference from KVM (i.e. there should be
>>> no state being migrated with KVM that is not already migrated with TCG).
>>>
>>>> warning: fd: migration to a file is deprecated. Use file: instead.
>>>> warning: fd: migration to a file is deprecated. Use file: instead.
>>>
>>> This is harmless.
>>>
>>>> **
>>>> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
>>>> (test program exited with status code -6)
>>>
>>> This is the assert at the end of the tests, irrelevant.
>>>
>>>>
>>>> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
>>>>
>>>> If you find this pull request caused the failure, please send a new
>>>> revision. Otherwise please let me know so we can continue to
>>>> investigate.
>>>
>>> I don't have an s390x host at hand so the only thing I can to is to drop
>>> that patch and hope that resolves the problem. @Peter, @Thomas, any
>>> other ideas? Can you verify this on your end?
>> 
>> Cannot reproduce either here, x86_64 host only.  The report was from s390
>> host, though.  I'm not familiar with the s390 patch, I wonder if any of you
>> could use plain brain power to figure more things out.
>> 
>> We could wait for 1-2 more days to see whether Thomas can figure it out,
>> hopefully easily reproduceable on s390.. or we can also leave that for
>> later.  And if the current issue on such fix is s390-host-only, might be
>> easier to be picked up by s390 tree, perhaps?
>
> I tested migration-20241217-pull-request on a s390x (RHEL) host, but I 
> cannot reproduce the issue there - make check-qtest works without any 
> problems. Is it maybe related to that specific Ubuntu installation?
>

Since we cannot reproduce outside of the staging CI, could we run that
job again with a diagnostic patch? Here's the rebased PR with the patch:

https://gitlab.com/farosas/qemu/-/commits/migration-next

(fork CI run: https://gitlab.com/farosas/qemu/-/pipelines/1610691202)

Or should I just send a v2 of this PR with the debug patch?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2025-01-03 18:30         ` Fabiano Rosas
@ 2025-01-03 20:31           ` Stefan Hajnoczi
  2025-01-03 21:00             ` Fabiano Rosas
  0 siblings, 1 reply; 29+ messages in thread
From: Stefan Hajnoczi @ 2025-01-03 20:31 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: Thomas Huth, Peter Xu, Stefan Hajnoczi, qemu-devel

On Fri, 3 Jan 2025 at 13:32, Fabiano Rosas <farosas@suse.de> wrote:
>
> Thomas Huth <thuth@redhat.com> writes:
>
> > On 20/12/2024 17.28, Peter Xu wrote:
> >> On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
> >>> Stefan Hajnoczi <stefanha@redhat.com> writes:
> >>>
> >>>> Hi Fabiano,
> >>>> Please take a look at this CI failure:
> >>>>
> >>>>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> >>>> ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
> >>>> stderr:
> >>>> Traceback (most recent call last):
> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
> >>>>      dump.read(dump_memory = args.memory)
> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
> >>>>      section.read()
> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
> >>>>      field['data'] = reader(field, self.file)
> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
> >>>>      for field in self.desc['struct']['fields']:
> >>>> KeyError: 'fields'
> >>>
> >>> This is the command line that runs only this specific test:
> >>>
> >>> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
> >>> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
> >>>
> >>> I cannot reproduce in migration-next nor in the detached HEAD that the
> >>> pipeline ran in (had to download the tarball from gitlab).
> >>>
> >>> The only s390 patch in this PR is one that I can test just fine with
> >>> TCG, so there shouldn't be any difference from KVM (i.e. there should be
> >>> no state being migrated with KVM that is not already migrated with TCG).
> >>>
> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
> >>>
> >>> This is harmless.
> >>>
> >>>> **
> >>>> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
> >>>> (test program exited with status code -6)
> >>>
> >>> This is the assert at the end of the tests, irrelevant.
> >>>
> >>>>
> >>>> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
> >>>>
> >>>> If you find this pull request caused the failure, please send a new
> >>>> revision. Otherwise please let me know so we can continue to
> >>>> investigate.
> >>>
> >>> I don't have an s390x host at hand so the only thing I can to is to drop
> >>> that patch and hope that resolves the problem. @Peter, @Thomas, any
> >>> other ideas? Can you verify this on your end?
> >>
> >> Cannot reproduce either here, x86_64 host only.  The report was from s390
> >> host, though.  I'm not familiar with the s390 patch, I wonder if any of you
> >> could use plain brain power to figure more things out.
> >>
> >> We could wait for 1-2 more days to see whether Thomas can figure it out,
> >> hopefully easily reproduceable on s390.. or we can also leave that for
> >> later.  And if the current issue on such fix is s390-host-only, might be
> >> easier to be picked up by s390 tree, perhaps?
> >
> > I tested migration-20241217-pull-request on a s390x (RHEL) host, but I
> > cannot reproduce the issue there - make check-qtest works without any
> > problems. Is it maybe related to that specific Ubuntu installation?
> >
>
> Since we cannot reproduce outside of the staging CI, could we run that
> job again with a diagnostic patch? Here's the rebased PR with the patch:
>
> https://gitlab.com/farosas/qemu/-/commits/migration-next
>
> (fork CI run: https://gitlab.com/farosas/qemu/-/pipelines/1610691202)
>
> Or should I just send a v2 of this PR with the debug patch?

Here is the staging CI pipeline for your migration-next tree:
https://gitlab.com/qemu-project/qemu/-/pipelines/1610836485

Stefan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2025-01-03 20:31           ` Stefan Hajnoczi
@ 2025-01-03 21:00             ` Fabiano Rosas
  2025-01-03 22:34               ` Fabiano Rosas
  0 siblings, 1 reply; 29+ messages in thread
From: Fabiano Rosas @ 2025-01-03 21:00 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Thomas Huth, Peter Xu, Stefan Hajnoczi, qemu-devel

Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Fri, 3 Jan 2025 at 13:32, Fabiano Rosas <farosas@suse.de> wrote:
>>
>> Thomas Huth <thuth@redhat.com> writes:
>>
>> > On 20/12/2024 17.28, Peter Xu wrote:
>> >> On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
>> >>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>> >>>
>> >>>> Hi Fabiano,
>> >>>> Please take a look at this CI failure:
>> >>>>
>> >>>>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>> >>>> ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
>> >>>> stderr:
>> >>>> Traceback (most recent call last):
>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
>> >>>>      dump.read(dump_memory = args.memory)
>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
>> >>>>      section.read()
>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
>> >>>>      field['data'] = reader(field, self.file)
>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
>> >>>>      for field in self.desc['struct']['fields']:
>> >>>> KeyError: 'fields'
>> >>>
>> >>> This is the command line that runs only this specific test:
>> >>>
>> >>> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
>> >>> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
>> >>>
>> >>> I cannot reproduce in migration-next nor in the detached HEAD that the
>> >>> pipeline ran in (had to download the tarball from gitlab).
>> >>>
>> >>> The only s390 patch in this PR is one that I can test just fine with
>> >>> TCG, so there shouldn't be any difference from KVM (i.e. there should be
>> >>> no state being migrated with KVM that is not already migrated with TCG).
>> >>>
>> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
>> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
>> >>>
>> >>> This is harmless.
>> >>>
>> >>>> **
>> >>>> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
>> >>>> (test program exited with status code -6)
>> >>>
>> >>> This is the assert at the end of the tests, irrelevant.
>> >>>
>> >>>>
>> >>>> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
>> >>>>
>> >>>> If you find this pull request caused the failure, please send a new
>> >>>> revision. Otherwise please let me know so we can continue to
>> >>>> investigate.
>> >>>
>> >>> I don't have an s390x host at hand so the only thing I can to is to drop
>> >>> that patch and hope that resolves the problem. @Peter, @Thomas, any
>> >>> other ideas? Can you verify this on your end?
>> >>
>> >> Cannot reproduce either here, x86_64 host only.  The report was from s390
>> >> host, though.  I'm not familiar with the s390 patch, I wonder if any of you
>> >> could use plain brain power to figure more things out.
>> >>
>> >> We could wait for 1-2 more days to see whether Thomas can figure it out,
>> >> hopefully easily reproduceable on s390.. or we can also leave that for
>> >> later.  And if the current issue on such fix is s390-host-only, might be
>> >> easier to be picked up by s390 tree, perhaps?
>> >
>> > I tested migration-20241217-pull-request on a s390x (RHEL) host, but I
>> > cannot reproduce the issue there - make check-qtest works without any
>> > problems. Is it maybe related to that specific Ubuntu installation?
>> >
>>
>> Since we cannot reproduce outside of the staging CI, could we run that
>> job again with a diagnostic patch? Here's the rebased PR with the patch:
>>
>> https://gitlab.com/farosas/qemu/-/commits/migration-next
>>
>> (fork CI run: https://gitlab.com/farosas/qemu/-/pipelines/1610691202)
>>
>> Or should I just send a v2 of this PR with the debug patch?
>
> Here is the staging CI pipeline for your migration-next tree:
> https://gitlab.com/qemu-project/qemu/-/pipelines/1610836485

Great, thanks! Let's find out what is going on...

>
> Stefan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2025-01-03 21:00             ` Fabiano Rosas
@ 2025-01-03 22:34               ` Fabiano Rosas
  2025-01-06 18:45                 ` Peter Xu
  0 siblings, 1 reply; 29+ messages in thread
From: Fabiano Rosas @ 2025-01-03 22:34 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Thomas Huth, Peter Xu, Stefan Hajnoczi, qemu-devel

Fabiano Rosas <farosas@suse.de> writes:

> Stefan Hajnoczi <stefanha@gmail.com> writes:
>
>> On Fri, 3 Jan 2025 at 13:32, Fabiano Rosas <farosas@suse.de> wrote:
>>>
>>> Thomas Huth <thuth@redhat.com> writes:
>>>
>>> > On 20/12/2024 17.28, Peter Xu wrote:
>>> >> On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
>>> >>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>>> >>>
>>> >>>> Hi Fabiano,
>>> >>>> Please take a look at this CI failure:
>>> >>>>
>>> >>>>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>>> >>>> ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
>>> >>>> stderr:
>>> >>>> Traceback (most recent call last):
>>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
>>> >>>>      dump.read(dump_memory = args.memory)
>>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
>>> >>>>      section.read()
>>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
>>> >>>>      field['data'] = reader(field, self.file)
>>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
>>> >>>>      for field in self.desc['struct']['fields']:
>>> >>>> KeyError: 'fields'
>>> >>>
>>> >>> This is the command line that runs only this specific test:
>>> >>>
>>> >>> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
>>> >>> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
>>> >>>
>>> >>> I cannot reproduce in migration-next nor in the detached HEAD that the
>>> >>> pipeline ran in (had to download the tarball from gitlab).
>>> >>>
>>> >>> The only s390 patch in this PR is one that I can test just fine with
>>> >>> TCG, so there shouldn't be any difference from KVM (i.e. there should be
>>> >>> no state being migrated with KVM that is not already migrated with TCG).
>>> >>>
>>> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
>>> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
>>> >>>
>>> >>> This is harmless.
>>> >>>
>>> >>>> **
>>> >>>> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
>>> >>>> (test program exited with status code -6)
>>> >>>
>>> >>> This is the assert at the end of the tests, irrelevant.
>>> >>>
>>> >>>>
>>> >>>> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
>>> >>>>
>>> >>>> If you find this pull request caused the failure, please send a new
>>> >>>> revision. Otherwise please let me know so we can continue to
>>> >>>> investigate.
>>> >>>
>>> >>> I don't have an s390x host at hand so the only thing I can to is to drop
>>> >>> that patch and hope that resolves the problem. @Peter, @Thomas, any
>>> >>> other ideas? Can you verify this on your end?
>>> >>
>>> >> Cannot reproduce either here, x86_64 host only.  The report was from s390
>>> >> host, though.  I'm not familiar with the s390 patch, I wonder if any of you
>>> >> could use plain brain power to figure more things out.
>>> >>
>>> >> We could wait for 1-2 more days to see whether Thomas can figure it out,
>>> >> hopefully easily reproduceable on s390.. or we can also leave that for
>>> >> later.  And if the current issue on such fix is s390-host-only, might be
>>> >> easier to be picked up by s390 tree, perhaps?
>>> >
>>> > I tested migration-20241217-pull-request on a s390x (RHEL) host, but I
>>> > cannot reproduce the issue there - make check-qtest works without any
>>> > problems. Is it maybe related to that specific Ubuntu installation?
>>> >
>>>
>>> Since we cannot reproduce outside of the staging CI, could we run that
>>> job again with a diagnostic patch? Here's the rebased PR with the patch:
>>>
>>> https://gitlab.com/farosas/qemu/-/commits/migration-next
>>>
>>> (fork CI run: https://gitlab.com/farosas/qemu/-/pipelines/1610691202)
>>>
>>> Or should I just send a v2 of this PR with the debug patch?
>>
>> Here is the staging CI pipeline for your migration-next tree:
>> https://gitlab.com/qemu-project/qemu/-/pipelines/1610836485
>
> Great, thanks! Let's find out what is going on...
>

It seems the issue is here:

{"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}
                                                              ^
And in QEMU:

static const VMStateDescription vmstate_css = {
    .name = "s390_css",
    ...
->      VMSTATE_ARRAY_OF_POINTER_TO_STRUCT(css, ChannelSubSys, MAX_CSSID + 1,
                0, vmstate_css_img, CssImage),

Is it legal to have an empty array? I would assume so. Are we maybe
missing a .needed?

Comparing with another similar vmstate spapr_llan/rx_pools in ppc
(-device spapr-vlan), what I see is:

{"name": "rx_pool", "array_len": 5, "type": "struct", "struct":
{"vmsd_name": "spapr_llan/rx_buffer_pool", ... }, "size": 32776}

So for CSS I'd expect:

-{"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}
+{"name": "css", "array_len": 256, "type": "struct", "struct": {"vmsd_name": "s390_css_img", ...}, "size": 1}

What is weird is that in my TCG run it also shows the empty struct and
the script doesn't seem to care. For some reason, in the CI job it
parses further into the JSON.

If anyone spots something, let me know. I'll get back to this on Monday
with a fresh mind.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2025-01-03 22:34               ` Fabiano Rosas
@ 2025-01-06 18:45                 ` Peter Xu
  2025-01-06 19:24                   ` Fabiano Rosas
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Xu @ 2025-01-06 18:45 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Thomas Huth, Stefan Hajnoczi, qemu-devel

On Fri, Jan 03, 2025 at 07:34:08PM -0300, Fabiano Rosas wrote:
> Fabiano Rosas <farosas@suse.de> writes:
> 
> > Stefan Hajnoczi <stefanha@gmail.com> writes:
> >
> >> On Fri, 3 Jan 2025 at 13:32, Fabiano Rosas <farosas@suse.de> wrote:
> >>>
> >>> Thomas Huth <thuth@redhat.com> writes:
> >>>
> >>> > On 20/12/2024 17.28, Peter Xu wrote:
> >>> >> On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
> >>> >>> Stefan Hajnoczi <stefanha@redhat.com> writes:
> >>> >>>
> >>> >>>> Hi Fabiano,
> >>> >>>> Please take a look at this CI failure:
> >>> >>>>
> >>> >>>>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> >>> >>>> ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
> >>> >>>> stderr:
> >>> >>>> Traceback (most recent call last):
> >>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
> >>> >>>>      dump.read(dump_memory = args.memory)
> >>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
> >>> >>>>      section.read()
> >>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
> >>> >>>>      field['data'] = reader(field, self.file)
> >>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
> >>> >>>>      for field in self.desc['struct']['fields']:
> >>> >>>> KeyError: 'fields'
> >>> >>>
> >>> >>> This is the command line that runs only this specific test:
> >>> >>>
> >>> >>> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
> >>> >>> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
> >>> >>>
> >>> >>> I cannot reproduce in migration-next nor in the detached HEAD that the
> >>> >>> pipeline ran in (had to download the tarball from gitlab).
> >>> >>>
> >>> >>> The only s390 patch in this PR is one that I can test just fine with
> >>> >>> TCG, so there shouldn't be any difference from KVM (i.e. there should be
> >>> >>> no state being migrated with KVM that is not already migrated with TCG).
> >>> >>>
> >>> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
> >>> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
> >>> >>>
> >>> >>> This is harmless.
> >>> >>>
> >>> >>>> **
> >>> >>>> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
> >>> >>>> (test program exited with status code -6)
> >>> >>>
> >>> >>> This is the assert at the end of the tests, irrelevant.
> >>> >>>
> >>> >>>>
> >>> >>>> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
> >>> >>>>
> >>> >>>> If you find this pull request caused the failure, please send a new
> >>> >>>> revision. Otherwise please let me know so we can continue to
> >>> >>>> investigate.
> >>> >>>
> >>> >>> I don't have an s390x host at hand so the only thing I can to is to drop
> >>> >>> that patch and hope that resolves the problem. @Peter, @Thomas, any
> >>> >>> other ideas? Can you verify this on your end?
> >>> >>
> >>> >> Cannot reproduce either here, x86_64 host only.  The report was from s390
> >>> >> host, though.  I'm not familiar with the s390 patch, I wonder if any of you
> >>> >> could use plain brain power to figure more things out.
> >>> >>
> >>> >> We could wait for 1-2 more days to see whether Thomas can figure it out,
> >>> >> hopefully easily reproduceable on s390.. or we can also leave that for
> >>> >> later.  And if the current issue on such fix is s390-host-only, might be
> >>> >> easier to be picked up by s390 tree, perhaps?
> >>> >
> >>> > I tested migration-20241217-pull-request on a s390x (RHEL) host, but I
> >>> > cannot reproduce the issue there - make check-qtest works without any
> >>> > problems. Is it maybe related to that specific Ubuntu installation?
> >>> >
> >>>
> >>> Since we cannot reproduce outside of the staging CI, could we run that
> >>> job again with a diagnostic patch? Here's the rebased PR with the patch:
> >>>
> >>> https://gitlab.com/farosas/qemu/-/commits/migration-next
> >>>
> >>> (fork CI run: https://gitlab.com/farosas/qemu/-/pipelines/1610691202)
> >>>
> >>> Or should I just send a v2 of this PR with the debug patch?
> >>
> >> Here is the staging CI pipeline for your migration-next tree:
> >> https://gitlab.com/qemu-project/qemu/-/pipelines/1610836485
> >
> > Great, thanks! Let's find out what is going on...
> >
> 
> It seems the issue is here:
> 
> {"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}
>                                                               ^
> And in QEMU:
> 
> static const VMStateDescription vmstate_css = {
>     .name = "s390_css",
>     ...
> ->      VMSTATE_ARRAY_OF_POINTER_TO_STRUCT(css, ChannelSubSys, MAX_CSSID + 1,
>                 0, vmstate_css_img, CssImage),
> 
> Is it legal to have an empty array? I would assume so. Are we maybe
> missing a .needed?

I guess we can always decide to dump things even if it's empty.

When I was looking at this, I saw a trick we played in vmstate dump, see
07d4e69147 ("migration/vmstate: fix array of ptr with nullptrs").  I am
guessing we hit a nullptr (or a bunch of..) here so the JSON part is
ignored.

> 
> Comparing with another similar vmstate spapr_llan/rx_pools in ppc
> (-device spapr-vlan), what I see is:
> 
> {"name": "rx_pool", "array_len": 5, "type": "struct", "struct":
> {"vmsd_name": "spapr_llan/rx_buffer_pool", ... }, "size": 32776}
> 
> So for CSS I'd expect:
> 
> -{"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}
> +{"name": "css", "array_len": 256, "type": "struct", "struct": {"vmsd_name": "s390_css_img", ...}, "size": 1}
> 
> What is weird is that in my TCG run it also shows the empty struct and
> the script doesn't seem to care. For some reason, in the CI job it
> parses further into the JSON.
> 
> If anyone spots something, let me know. I'll get back to this on Monday
> with a fresh mind.

So I thought about a solution; it's not easy to do it clean in a small
change.  So here it is, not so small but not huge either.  This is the
cleanest I can come up with.. attached at the end.

If it works, we're 100% lucky.  I hope VMSDFieldGeneric in the script will
already work for the nullptrs.  If not, hopefully this provides some
insight so you can move further..

===8<===
From e5339d55f71df2d96d99dbd7eb845f06da0e68aa Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Mon, 6 Jan 2025 13:18:25 -0500
Subject: [PATCH] migration: Dump correct JSON format for nullptr replacement

QEMU plays a trick with null pointers inside an array of pointers in a VMSD
field.  See 07d4e69147 ("migration/vmstate: fix array of ptr with
nullptrs") for more details on why.  The idea makes sense in general, but
it may overlooked the JSON writer where it could write nothing in a
"struct" in the JSON hints section.

We hit some analyze-migration.py issues on s390 recently, showing that some
of the struct field contains nothing, like:

{"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}

As described in details by Fabiano:

https://lore.kernel.org/r/87pll37cin.fsf@suse.de

It could be that we hit some null pointers there, and JSON was gone when
they're null pointers.

To fix it, instead of hacking around only at VMStateInfo level, do that
from VMStateField level, so that JSON writer can also be involved.  In this
case, JSON writer will replace the pointer array (which used to be a
"struct") to be the real representation of the nullptr field.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/vmstate.c | 118 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 91 insertions(+), 27 deletions(-)

diff --git a/migration/vmstate.c b/migration/vmstate.c
index fa002b24e8..e9321b7846 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -51,6 +51,36 @@ vmstate_field_exists(const VMStateDescription *vmsd, const VMStateField *field,
     return result;
 }
 
+/*
+ * Create a fake nullptr field when there's a NULL pointer detected in the
+ * array of a VMS_ARRAY_OF_POINTER VMSD field.  It's needed because we
+ * can't dereference the NULL pointer.
+ */
+static const VMStateField *
+vmsd_create_fake_nullptr_field(const VMStateField *field)
+{
+    VMStateField *fake = g_new0(VMStateField, 1);
+
+    /* It can only happen on an array of pointers! */
+    assert(field->flags & VMS_ARRAY_OF_POINTER);
+
+    /* Some of fake's properties should match the original's */
+    fake->name = field->name;
+    fake->version_id = field->version_id;
+
+    /* Do not need "field_exists" check as it always exists (which is null) */
+    fake->field_exists = NULL;
+
+    /* See vmstate_info_nullptr - use 1 byte to represent nullptr */
+    fake->size = 1;
+    fake->info = &vmstate_info_nullptr;
+    fake->flags = VMS_SINGLE;
+
+    /* All the rest fields shouldn't matter.. */
+
+    return (const VMStateField *)fake;
+}
+
 static int vmstate_n_elems(void *opaque, const VMStateField *field)
 {
     int n_elems = 1;
@@ -143,23 +173,39 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
             }
             for (i = 0; i < n_elems; i++) {
                 void *curr_elem = first_elem + size * i;
+                const VMStateField *inner_field;
 
                 if (field->flags & VMS_ARRAY_OF_POINTER) {
                     curr_elem = *(void **)curr_elem;
                 }
+
                 if (!curr_elem && size) {
-                    /* if null pointer check placeholder and do not follow */
-                    assert(field->flags & VMS_ARRAY_OF_POINTER);
-                    ret = vmstate_info_nullptr.get(f, curr_elem, size, NULL);
-                } else if (field->flags & VMS_STRUCT) {
-                    ret = vmstate_load_state(f, field->vmsd, curr_elem,
-                                             field->vmsd->version_id);
-                } else if (field->flags & VMS_VSTRUCT) {
-                    ret = vmstate_load_state(f, field->vmsd, curr_elem,
-                                             field->struct_version_id);
+                    /*
+                     * If null pointer found (which should only happen in
+                     * an array of pointers), use null placeholder and do
+                     * not follow.
+                     */
+                    inner_field = vmsd_create_fake_nullptr_field(field);
+                } else {
+                    inner_field = field;
+                }
+
+                if (inner_field->flags & VMS_STRUCT) {
+                    ret = vmstate_load_state(f, inner_field->vmsd, curr_elem,
+                                             inner_field->vmsd->version_id);
+                } else if (inner_field->flags & VMS_VSTRUCT) {
+                    ret = vmstate_load_state(f, inner_field->vmsd, curr_elem,
+                                             inner_field->struct_version_id);
                 } else {
-                    ret = field->info->get(f, curr_elem, size, field);
+                    ret = inner_field->info->get(f, curr_elem, size,
+                                                 inner_field);
                 }
+
+                /* If we used a fake temp field.. free it now */
+                if (inner_field != field) {
+                    g_clear_pointer((gpointer *)&inner_field, g_free);
+                }
+
                 if (ret >= 0) {
                     ret = qemu_file_get_error(f);
                 }
@@ -387,29 +433,50 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
             }
             for (i = 0; i < n_elems; i++) {
                 void *curr_elem = first_elem + size * i;
+                const VMStateField *inner_field;
 
-                vmsd_desc_field_start(vmsd, vmdesc_loop, field, i, n_elems);
                 old_offset = qemu_file_transferred(f);
                 if (field->flags & VMS_ARRAY_OF_POINTER) {
                     assert(curr_elem);
                     curr_elem = *(void **)curr_elem;
                 }
+
                 if (!curr_elem && size) {
-                    /* if null pointer write placeholder and do not follow */
-                    assert(field->flags & VMS_ARRAY_OF_POINTER);
-                    ret = vmstate_info_nullptr.put(f, curr_elem, size, NULL,
-                                                   NULL);
-                } else if (field->flags & VMS_STRUCT) {
-                    ret = vmstate_save_state(f, field->vmsd, curr_elem,
-                                             vmdesc_loop);
-                } else if (field->flags & VMS_VSTRUCT) {
-                    ret = vmstate_save_state_v(f, field->vmsd, curr_elem,
-                                               vmdesc_loop,
-                                               field->struct_version_id, errp);
+                    /*
+                     * If null pointer found (which should only happen in
+                     * an array of pointers), use null placeholder and do
+                     * not follow.
+                     */
+                    inner_field = vmsd_create_fake_nullptr_field(field);
+                } else {
+                    inner_field = field;
+                }
+
+                vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field,
+                                      i, n_elems);
+
+                if (inner_field->flags & VMS_STRUCT) {
+                    ret = vmstate_save_state(f, inner_field->vmsd,
+                                             curr_elem, vmdesc_loop);
+                } else if (inner_field->flags & VMS_VSTRUCT) {
+                    ret = vmstate_save_state_v(f, inner_field->vmsd,
+                                               curr_elem, vmdesc_loop,
+                                               inner_field->struct_version_id,
+                                               errp);
                 } else {
-                    ret = field->info->put(f, curr_elem, size, field,
-                                     vmdesc_loop);
+                    ret = inner_field->info->put(f, curr_elem, size,
+                                                 inner_field, vmdesc_loop);
                 }
+
+                written_bytes = qemu_file_transferred(f) - old_offset;
+                vmsd_desc_field_end(vmsd, vmdesc_loop, inner_field,
+                                    written_bytes, i);
+
+                /* If we used a fake temp field.. free it now */
+                if (inner_field != field) {
+                    g_clear_pointer((gpointer *)&inner_field, g_free);
+                }
+
                 if (ret) {
                     error_setg(errp, "Save of field %s/%s failed",
                                 vmsd->name, field->name);
@@ -419,9 +486,6 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
                     return ret;
                 }
 
-                written_bytes = qemu_file_transferred(f) - old_offset;
-                vmsd_desc_field_end(vmsd, vmdesc_loop, field, written_bytes, i);
-
                 /* Compressed arrays only care about the first element */
                 if (vmdesc_loop && vmsd_can_compress(field)) {
                     vmdesc_loop = NULL;
-- 
2.47.0

===8<===

-- 
Peter Xu



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2025-01-06 18:45                 ` Peter Xu
@ 2025-01-06 19:24                   ` Fabiano Rosas
  2025-01-06 20:22                     ` Peter Xu
  0 siblings, 1 reply; 29+ messages in thread
From: Fabiano Rosas @ 2025-01-06 19:24 UTC (permalink / raw)
  To: Peter Xu; +Cc: Stefan Hajnoczi, Thomas Huth, Stefan Hajnoczi, qemu-devel

Peter Xu <peterx@redhat.com> writes:

> On Fri, Jan 03, 2025 at 07:34:08PM -0300, Fabiano Rosas wrote:
>> Fabiano Rosas <farosas@suse.de> writes:
>> 
>> > Stefan Hajnoczi <stefanha@gmail.com> writes:
>> >
>> >> On Fri, 3 Jan 2025 at 13:32, Fabiano Rosas <farosas@suse.de> wrote:
>> >>>
>> >>> Thomas Huth <thuth@redhat.com> writes:
>> >>>
>> >>> > On 20/12/2024 17.28, Peter Xu wrote:
>> >>> >> On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
>> >>> >>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>> >>> >>>
>> >>> >>>> Hi Fabiano,
>> >>> >>>> Please take a look at this CI failure:
>> >>> >>>>
>> >>> >>>>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>> >>> >>>> ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
>> >>> >>>> stderr:
>> >>> >>>> Traceback (most recent call last):
>> >>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
>> >>> >>>>      dump.read(dump_memory = args.memory)
>> >>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
>> >>> >>>>      section.read()
>> >>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
>> >>> >>>>      field['data'] = reader(field, self.file)
>> >>> >>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
>> >>> >>>>      for field in self.desc['struct']['fields']:
>> >>> >>>> KeyError: 'fields'
>> >>> >>>
>> >>> >>> This is the command line that runs only this specific test:
>> >>> >>>
>> >>> >>> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
>> >>> >>> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
>> >>> >>>
>> >>> >>> I cannot reproduce in migration-next nor in the detached HEAD that the
>> >>> >>> pipeline ran in (had to download the tarball from gitlab).
>> >>> >>>
>> >>> >>> The only s390 patch in this PR is one that I can test just fine with
>> >>> >>> TCG, so there shouldn't be any difference from KVM (i.e. there should be
>> >>> >>> no state being migrated with KVM that is not already migrated with TCG).
>> >>> >>>
>> >>> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
>> >>> >>>> warning: fd: migration to a file is deprecated. Use file: instead.
>> >>> >>>
>> >>> >>> This is harmless.
>> >>> >>>
>> >>> >>>> **
>> >>> >>>> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
>> >>> >>>> (test program exited with status code -6)
>> >>> >>>
>> >>> >>> This is the assert at the end of the tests, irrelevant.
>> >>> >>>
>> >>> >>>>
>> >>> >>>> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
>> >>> >>>>
>> >>> >>>> If you find this pull request caused the failure, please send a new
>> >>> >>>> revision. Otherwise please let me know so we can continue to
>> >>> >>>> investigate.
>> >>> >>>
>> >>> >>> I don't have an s390x host at hand so the only thing I can to is to drop
>> >>> >>> that patch and hope that resolves the problem. @Peter, @Thomas, any
>> >>> >>> other ideas? Can you verify this on your end?
>> >>> >>
>> >>> >> Cannot reproduce either here, x86_64 host only.  The report was from s390
>> >>> >> host, though.  I'm not familiar with the s390 patch, I wonder if any of you
>> >>> >> could use plain brain power to figure more things out.
>> >>> >>
>> >>> >> We could wait for 1-2 more days to see whether Thomas can figure it out,
>> >>> >> hopefully easily reproduceable on s390.. or we can also leave that for
>> >>> >> later.  And if the current issue on such fix is s390-host-only, might be
>> >>> >> easier to be picked up by s390 tree, perhaps?
>> >>> >
>> >>> > I tested migration-20241217-pull-request on a s390x (RHEL) host, but I
>> >>> > cannot reproduce the issue there - make check-qtest works without any
>> >>> > problems. Is it maybe related to that specific Ubuntu installation?
>> >>> >
>> >>>
>> >>> Since we cannot reproduce outside of the staging CI, could we run that
>> >>> job again with a diagnostic patch? Here's the rebased PR with the patch:
>> >>>
>> >>> https://gitlab.com/farosas/qemu/-/commits/migration-next
>> >>>
>> >>> (fork CI run: https://gitlab.com/farosas/qemu/-/pipelines/1610691202)
>> >>>
>> >>> Or should I just send a v2 of this PR with the debug patch?
>> >>
>> >> Here is the staging CI pipeline for your migration-next tree:
>> >> https://gitlab.com/qemu-project/qemu/-/pipelines/1610836485
>> >
>> > Great, thanks! Let's find out what is going on...
>> >
>> 
>> It seems the issue is here:
>> 
>> {"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}
>>                                                               ^
>> And in QEMU:
>> 
>> static const VMStateDescription vmstate_css = {
>>     .name = "s390_css",
>>     ...
>> ->      VMSTATE_ARRAY_OF_POINTER_TO_STRUCT(css, ChannelSubSys, MAX_CSSID + 1,
>>                 0, vmstate_css_img, CssImage),
>> 
>> Is it legal to have an empty array? I would assume so. Are we maybe
>> missing a .needed?
>
> I guess we can always decide to dump things even if it's empty.
>
> When I was looking at this, I saw a trick we played in vmstate dump, see
> 07d4e69147 ("migration/vmstate: fix array of ptr with nullptrs").  I am
> guessing we hit a nullptr (or a bunch of..) here so the JSON part is
> ignored.
>
>> 
>> Comparing with another similar vmstate spapr_llan/rx_pools in ppc
>> (-device spapr-vlan), what I see is:
>> 
>> {"name": "rx_pool", "array_len": 5, "type": "struct", "struct":
>> {"vmsd_name": "spapr_llan/rx_buffer_pool", ... }, "size": 32776}
>> 
>> So for CSS I'd expect:
>> 
>> -{"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}
>> +{"name": "css", "array_len": 256, "type": "struct", "struct": {"vmsd_name": "s390_css_img", ...}, "size": 1}
>> 
>> What is weird is that in my TCG run it also shows the empty struct and
>> the script doesn't seem to care. For some reason, in the CI job it
>> parses further into the JSON.
>> 
>> If anyone spots something, let me know. I'll get back to this on Monday
>> with a fresh mind.
>

Hi, Peter

We already spoke on IRC, but so everyone is in the same page:

The analyze-migration.py script is broken for s390x even in
master. That's why* we cannot reproduce this issue in our local
setups. The s390-storage_attributes section is failing to parse the last
STATTR_FLAG_EOS, which is a u64 0x1 that the generic code then reads a
byte from and sees 0x0 == QEMU_VM_EOF.

*- yes, this doesn't account for the s390 host that Thomas used
   which didn't reproduce the issue, but still...

The patch is here and I'll include it at the end of the email as well:

https://gitlab.com/farosas/qemu/-/commit/5bcad03aad85556a7b72f79d3574e246a99432c3.patch

> So I thought about a solution; it's not easy to do it clean in a small
> change.  So here it is, not so small but not huge either.  This is the
> cleanest I can come up with.. attached at the end.
>
> If it works, we're 100% lucky.  I hope VMSDFieldGeneric in the script will
> already work for the nullptrs.  If not, hopefully this provides some
> insight so you can move further..
>
> ===8<===
> From e5339d55f71df2d96d99dbd7eb845f06da0e68aa Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Mon, 6 Jan 2025 13:18:25 -0500
> Subject: [PATCH] migration: Dump correct JSON format for nullptr replacement
>
> QEMU plays a trick with null pointers inside an array of pointers in a VMSD
> field.  See 07d4e69147 ("migration/vmstate: fix array of ptr with
> nullptrs") for more details on why.  The idea makes sense in general, but
> it may overlooked the JSON writer where it could write nothing in a
> "struct" in the JSON hints section.

Interesting, I didn't know about that. I'm indeed seeing some stray "48"
('0') now in the stream. I'll give your patch a try.

Here's the fix for the pre-existing issue in the script:

-- 8< --
From 5bcad03aad85556a7b72f79d3574e246a99432c3 Mon Sep 17 00:00:00 2001
From: Fabiano Rosas <farosas@suse.de>
Date: Mon, 6 Jan 2025 15:05:31 -0300
Subject: [PATCH 1/2] migration: Fix parsing of s390 stream

The parsing for the S390StorageAttributes section is currently leaving
an unconsumed token that is later interpreted by the generic code as
QEMU_VM_EOF, cutting the parsing short.

The migration will issue a STATTR_FLAG_DONE between iterations, but
there's a final STATTR_FLAG_EOS at .save_complete.

Fixes: 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 scripts/analyze-migration.py | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index f2457b1dde..2a2160cbf7 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -65,6 +65,9 @@ def readvar(self, size = None):
     def tell(self):
         return self.file.tell()
 
+    def seek(self, a, b):
+        return self.file.seek(a, b)
+
     # The VMSD description is at the end of the file, after EOF. Look for
     # the last NULL byte, then for the beginning brace of JSON.
     def read_migration_debug_json(self):
@@ -272,11 +275,24 @@ def __init__(self, file, version_id, device, section_key):
         self.section_key = section_key
 
     def read(self):
+        pos = 0
         while True:
             addr_flags = self.file.read64()
             flags = addr_flags & 0xfff
-            if (flags & (self.STATTR_FLAG_DONE | self.STATTR_FLAG_EOS)):
+
+            if flags & self.STATTR_FLAG_DONE:
+                pos = self.file.tell()
+                continue
+            elif flags & self.STATTR_FLAG_EOS:
                 return
+            else:
+                # No EOS came after DONE, that's OK, but rewind the
+                # stream because this is not our data.
+                if pos:
+                    self.file.seek(pos, 0)
+                    return
+                raise Exception("Unknown flags %x", flags)
+
             if (flags & self.STATTR_FLAG_ERROR):
                 raise Exception("Error in migration stream")
             count = self.file.read64()
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PULL 00/17] Migration patches for 2024-12-17
  2025-01-06 19:24                   ` Fabiano Rosas
@ 2025-01-06 20:22                     ` Peter Xu
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Xu @ 2025-01-06 20:22 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: Stefan Hajnoczi, Thomas Huth, Stefan Hajnoczi, qemu-devel

On Mon, Jan 06, 2025 at 04:24:53PM -0300, Fabiano Rosas wrote:
> Here's the fix for the pre-existing issue in the script:

For this patch:

> 
> -- 8< --
> From 5bcad03aad85556a7b72f79d3574e246a99432c3 Mon Sep 17 00:00:00 2001
> From: Fabiano Rosas <farosas@suse.de>
> Date: Mon, 6 Jan 2025 15:05:31 -0300
> Subject: [PATCH 1/2] migration: Fix parsing of s390 stream
> 
> The parsing for the S390StorageAttributes section is currently leaving
> an unconsumed token that is later interpreted by the generic code as
> QEMU_VM_EOF, cutting the parsing short.

Better mention why it can be intepreted as QEMU_VM_EOF, especially s390's
tag is 8 bytes while QEMU's EOF is 1 byte.. that confused me. :)

> 
> The migration will issue a STATTR_FLAG_DONE between iterations, but
> there's a final STATTR_FLAG_EOS at .save_complete.
> 
> Fixes: 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  scripts/analyze-migration.py | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
> index f2457b1dde..2a2160cbf7 100755
> --- a/scripts/analyze-migration.py
> +++ b/scripts/analyze-migration.py
> @@ -65,6 +65,9 @@ def readvar(self, size = None):
>      def tell(self):
>          return self.file.tell()
>  
> +    def seek(self, a, b):
> +        return self.file.seek(a, b)
> +
>      # The VMSD description is at the end of the file, after EOF. Look for
>      # the last NULL byte, then for the beginning brace of JSON.
>      def read_migration_debug_json(self):
> @@ -272,11 +275,24 @@ def __init__(self, file, version_id, device, section_key):
>          self.section_key = section_key
>  
>      def read(self):
> +        pos = 0
>          while True:
>              addr_flags = self.file.read64()
>              flags = addr_flags & 0xfff
> -            if (flags & (self.STATTR_FLAG_DONE | self.STATTR_FLAG_EOS)):
> +
> +            if flags & self.STATTR_FLAG_DONE:
> +                pos = self.file.tell()
> +                continue
> +            elif flags & self.STATTR_FLAG_EOS:
>                  return
> +            else:
> +                # No EOS came after DONE, that's OK, but rewind the
> +                # stream because this is not our data.
> +                if pos:
> +                    self.file.seek(pos, 0)

Nit: use io.SEEK_SET.

> +                    return
> +                raise Exception("Unknown flags %x", flags)
> +
>              if (flags & self.STATTR_FLAG_ERROR):
>                  raise Exception("Error in migration stream")
>              count = self.file.read64()

With above:

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2025-01-06 20:23 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-17 17:48 [PULL 00/17] Migration patches for 2024-12-17 Fabiano Rosas
2024-12-17 17:48 ` [PULL 01/17] migration/multifd: Fix compile error caused by page_size usage Fabiano Rosas
2024-12-17 17:48 ` [PULL 02/17] migration/multifd: Further remove the SYNC on complete Fabiano Rosas
2024-12-17 17:48 ` [PULL 03/17] migration/multifd: Allow to sync with sender threads only Fabiano Rosas
2024-12-17 17:48 ` [PULL 04/17] migration/ram: Move RAM_SAVE_FLAG* into ram.h Fabiano Rosas
2024-12-17 17:48 ` [PULL 05/17] migration/multifd: Unify RAM_SAVE_FLAG_MULTIFD_FLUSH messages Fabiano Rosas
2024-12-17 17:48 ` [PULL 06/17] migration/multifd: Remove sync processing on postcopy Fabiano Rosas
2024-12-17 17:48 ` [PULL 07/17] migration/multifd: Cleanup src flushes on condition check Fabiano Rosas
2024-12-17 17:48 ` [PULL 08/17] migration/multifd: Document the reason to sync for save_setup() Fabiano Rosas
2024-12-17 17:48 ` [PULL 09/17] migration/multifd: Fix compat with QEMU < 9.0 Fabiano Rosas
2024-12-17 17:48 ` [PULL 10/17] s390x: Fix CSS migration Fabiano Rosas
2024-12-17 17:48 ` [PULL 11/17] migration: Add helper to get target runstate Fabiano Rosas
2024-12-17 17:48 ` [PULL 12/17] qmp/cont: Only activate disks if migration completed Fabiano Rosas
2024-12-17 17:48 ` [PULL 13/17] migration/block: Make late-block-active the default Fabiano Rosas
2024-12-17 17:48 ` [PULL 14/17] migration/block: Apply late-block-active behavior to postcopy Fabiano Rosas
2024-12-17 17:48 ` [PULL 15/17] migration/block: Fix possible race with block_inactive Fabiano Rosas
2024-12-17 17:48 ` [PULL 16/17] migration/block: Rewrite disk activation Fabiano Rosas
2024-12-17 17:48 ` [PULL 17/17] tests/qtest/migration: Fix compile errors when CONFIG_UADK is set Fabiano Rosas
2024-12-19 12:32 ` [PULL 00/17] Migration patches for 2024-12-17 Stefan Hajnoczi
2024-12-19 18:53   ` Fabiano Rosas
2024-12-20 16:28     ` Peter Xu
2025-01-02  9:32       ` Thomas Huth
2025-01-03 18:30         ` Fabiano Rosas
2025-01-03 20:31           ` Stefan Hajnoczi
2025-01-03 21:00             ` Fabiano Rosas
2025-01-03 22:34               ` Fabiano Rosas
2025-01-06 18:45                 ` Peter Xu
2025-01-06 19:24                   ` Fabiano Rosas
2025-01-06 20:22                     ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.