* [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
  2023-10-16 10:06 [PULL 00/38] Migration 20231016 patches Juan Quintela
@ 2023-10-16 10:06 ` Juan Quintela
  0 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-16 10:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Paolo Bonzini, Vladimir Sementsov-Ogievskiy,
	Fam Zheng, Cleber Rosa, Eric Blake, Li Zhijian, Peter Xu,
	Markus Armbruster, John Snow, Stefan Hajnoczi, Juan Quintela,
	Leonardo Bras, Laurent Vivier, Fabiano Rosas, Thomas Huth

From: Fabiano Rosas <farosas@suse.de>

Add a smoke test that migrates to a file and gives it to the
script. It should catch the most annoying errors such as changes in
the ram flags.

After code has been merged it becomes much harder to figure out what is
causing the script to fail; the person making the change is the one most
likely to know right away what the problem is.

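For reference, a minimal sketch in Python (not part of the patch; the file
path is made up) of what the new test does once the migration to a file has
completed: run the analyzer over the stream and treat a non-zero exit
status as a failure:

  import subprocess, sys

  migfile = "/tmp/migfile"  # hypothetical location of the migration stream

  result = subprocess.run(
      [sys.executable, "scripts/analyze-migration.py", "-f", migfile],
      stdout=subprocess.DEVNULL)
  if result.returncode != 0:
      print("Failed to analyze the migration stream")
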
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231009184326.15777-7-farosas@suse.de>
---
 tests/qtest/migration-test.c | 60 ++++++++++++++++++++++++++++++++++++
 tests/qtest/meson.build      |  2 ++
 2 files changed, 62 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 8eb2053dbb..cef5081f8c 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -66,6 +66,8 @@ static bool got_dst_resume;
  */
 #define DIRTYLIMIT_TOLERANCE_RANGE  25  /* MB/s */
 
+#define ANALYZE_SCRIPT "scripts/analyze-migration.py"
+
 #if defined(__linux__)
 #include <sys/syscall.h>
 #include <sys/vfs.h>
@@ -1501,6 +1503,61 @@ static void test_baddest(void)
     test_migrate_end(from, to, false);
 }
 
+#ifndef _WIN32
+static void test_analyze_script(void)
+{
+    MigrateStart args = {
+        .opts_source = "-uuid 11111111-1111-1111-1111-111111111111",
+    };
+    QTestState *from, *to;
+    g_autofree char *uri = NULL;
+    g_autofree char *file = NULL;
+    int pid, wstatus;
+    const char *python = g_getenv("PYTHON");
+
+    if (!python) {
+        g_test_skip("PYTHON variable not set");
+        return;
+    }
+
+    /* dummy url */
+    if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", &args)) {
+        return;
+    }
+
+    /*
+     * Setting these two capabilities causes the "configuration"
+     * vmstate to include subsections for them. The script needs to
+     * parse those subsections properly.
+     */
+    migrate_set_capability(from, "validate-uuid", true);
+    migrate_set_capability(from, "x-ignore-shared", true);
+
+    file = g_strdup_printf("%s/migfile", tmpfs);
+    uri = g_strdup_printf("exec:cat > %s", file);
+
+    migrate_ensure_converge(from);
+    migrate_qmp(from, uri, "{}");
+    wait_for_migration_complete(from);
+
+    pid = fork();
+    if (!pid) {
+        close(1);
+        open("/dev/null", O_WRONLY);
+        execl(python, python, ANALYZE_SCRIPT, "-f", file, NULL);
+        g_assert_not_reached();
+    }
+
+    g_assert(waitpid(pid, &wstatus, 0) == pid);
+    if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != 0) {
+        g_test_message("Failed to analyze the migration stream");
+        g_test_fail();
+    }
+    test_migrate_end(from, to, false);
+    cleanup("migfile");
+}
+#endif
+
 static void test_precopy_common(MigrateCommon *args)
 {
     QTestState *from, *to;
@@ -2837,6 +2894,9 @@ int main(int argc, char **argv)
     }
 
     qtest_add_func("/migration/bad_dest", test_baddest);
+#ifndef _WIN32
+    qtest_add_func("/migration/analyze-script", test_analyze_script);
+#endif
     qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
     qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
     /*
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 66795cfcd2..d6022ebd64 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -357,6 +357,8 @@ foreach dir : target_dirs
     test_deps += [qsd]
   endif
 
+  qtest_env.set('PYTHON', python.full_path())
+
   foreach test : target_qtests
     # Executables are shared across targets, declare them only the first time we
     # encounter them
-- 
2.41.0




* [PULL 00/38] Migration 20231017 patches
@ 2023-10-17  8:29 Juan Quintela
  2023-10-17  8:29 ` [PULL 01/38] migration: refactor migration_completion Juan Quintela
                   ` (38 more replies)
  0 siblings, 39 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

The following changes since commit 800485762e6564e04e2ab315132d477069562d91:

  Merge tag 'python-pull-request' of https://gitlab.com/jsnow/qemu into staging (2023-10-16 12:37:48 -0400)

are available in the Git repository at:

  https://gitlab.com/juan.quintela/qemu.git tags/migration-20231017-pull-request

for you to fetch changes up to 967e3889874b1116090a60c0cb43157130bdbd16:

  migration/multifd: Clarify Error usage in multifd_channel_connect (2023-10-17 09:25:14 +0200)

----------------------------------------------------------------
Migration Pull request (20231017)

Hi

Same as yesterday's pull request, except:
- rebased to latest (clean rebase)
- fixed a 64-bit read on big-endian hosts

CI: https://gitlab.com/juan.quintela/qemu/-/pipelines/1039214198

Please, apply.

----------------------------------------------------------------

Dmitry Frolov (1):
  migration: fix RAMBlock add NULL check

Elena Ufimtseva (3):
  migration: check for rate_limit_max for RATE_LIMIT_DISABLED
  multifd: fix counters in multifd_send_thread
  multifd: reset next_packet_len after sending pages

Fabiano Rosas (13):
  migration: Fix analyze-migration.py 'configuration' parsing
  migration: Add capability parsing to analyze-migration.py
  migration: Fix analyze-migration.py when ignore-shared is used
  migration: Fix analyze-migration read operation signedness
  tests/qtest/migration: Add a test for the analyze-migration script
  tests/qtest: migration-test: Add tests for file-based migration
  migration/ram: Remove RAMState from xbzrle_cache_zero_page
  migration/ram: Stop passing QEMUFile around in save_zero_page
  migration/ram: Move xbzrle zero page handling into save_zero_page
  migration/ram: Merge save_zero_page functions
  migration/multifd: Remove direct "socket" references
  migration/multifd: Unify multifd_send_thread error paths
  migration/multifd: Clarify Error usage in multifd_channel_connect

Fiona Ebner (1):
  migration: hold the BQL during setup

Juan Quintela (15):
  migration: Non multifd migration don't care about multifd flushes
  migration: Create migrate_rdma()
  migration/rdma: Unfold ram_control_before_iterate()
  migration/rdma: Unfold ram_control_after_iterate()
  migration/rdma: Remove all uses of RAM_CONTROL_HOOK
  migration/rdma: Unfold hook_ram_load()
  migration/rdma: Create rdma_control_save_page()
  qemu-file: Remove QEMUFileHooks
  migration/rdma: Move rdma constants from qemu-file.h to rdma.h
  migration/rdma: Remove qemu_ prefix from exported functions
  migration/rdma: Check sooner if we are in postcopy for save_page()
  migration/rdma: Use i as for index instead of idx
  migration/rdma: Declare for index variables local
  migration/rdma: Remove all "ret" variables that are used only once
  migration: Improve json and formatting

Nikolay Borisov (2):
  migration: Add the configuration vmstate to the json writer
  migration/ram: Refactor precopy ram loading code

Peter Xu (1):
  migration: Allow user to specify available switchover bandwidth

Philippe Mathieu-Daudé (1):
  migration: Use g_autofree to simplify ram_dirty_bitmap_reload()

Wei Wang (1):
  migration: refactor migration_completion

 qapi/migration.json            |  41 ++++-
 include/migration/register.h   |   2 +-
 migration/migration.h          |   4 +-
 migration/options.h            |   2 +
 migration/qemu-file.h          |  49 ------
 migration/rdma.h               |  42 +++++
 migration/block-dirty-bitmap.c |   3 -
 migration/block.c              |   5 -
 migration/migration-hmp-cmds.c |  14 ++
 migration/migration-stats.c    |   9 +-
 migration/migration.c          | 199 +++++++++++++--------
 migration/multifd.c            | 101 +++++------
 migration/options.c            |  35 ++++
 migration/qemu-file.c          |  61 +------
 migration/ram.c                | 306 ++++++++++++++++++---------------
 migration/rdma.c               | 259 ++++++++++++----------------
 migration/savevm.c             |  22 ++-
 tests/qtest/migration-test.c   | 207 ++++++++++++++++++++++
 migration/trace-events         |  33 ++--
 scripts/analyze-migration.py   |  67 +++++++-
 tests/qtest/meson.build        |   2 +
 21 files changed, 895 insertions(+), 568 deletions(-)

-- 
2.41.0




* [PULL 01/38] migration: refactor migration_completion
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 02/38] migration: Use g_autofree to simplify ram_dirty_bitmap_reload() Juan Quintela
                   ` (37 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier, Wei Wang,
	Isaku Yamahata

From: Wei Wang <wei.w.wang@intel.com>

The current migration_completion function is a bit long. Refactor the long
implementation into different subfunctions:
- migration_completion_precopy: completion code related to precopy
- migration_completion_postcopy: completion code related to postcopy

Also rename await_return_path_close_on_source to
close_return_path_on_source, so that it matches
open_return_path_on_source.

This improves readability and makes future updates easier (e.g. adding new
subfunctions when completion code related to new features is needed). No
functional changes intended.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230804093053.5037-1-wei.w.wang@intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 167 ++++++++++++++++++++++++------------------
 1 file changed, 94 insertions(+), 73 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 1c6c81ad49..0e1002d017 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -99,7 +99,7 @@ static int migration_maybe_pause(MigrationState *s,
                                  int *current_active_state,
                                  int new_state);
 static void migrate_fd_cancel(MigrationState *s);
-static int await_return_path_close_on_source(MigrationState *s);
+static int close_return_path_on_source(MigrationState *s);
 
 static bool migration_needs_multiple_sockets(void)
 {
@@ -1191,7 +1191,7 @@ static void migrate_fd_cleanup(MigrationState *s)
      * We already cleaned up to_dst_file, so errors from the return
      * path might be due to that, ignore them.
      */
-    await_return_path_close_on_source(s);
+    close_return_path_on_source(s);
 
     assert(!migration_is_active(s));
 
@@ -2049,8 +2049,7 @@ static int open_return_path_on_source(MigrationState *ms)
     return 0;
 }
 
-/* Returns 0 if the RP was ok, otherwise there was an error on the RP */
-static int await_return_path_close_on_source(MigrationState *ms)
+static int close_return_path_on_source(MigrationState *ms)
 {
     int ret;
 
@@ -2317,6 +2316,87 @@ static int migration_maybe_pause(MigrationState *s,
     return s->state == new_state ? 0 : -EINVAL;
 }
 
+static int migration_completion_precopy(MigrationState *s,
+                                        int *current_active_state)
+{
+    int ret;
+
+    qemu_mutex_lock_iothread();
+    s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
+
+    s->vm_old_state = runstate_get();
+    global_state_store();
+
+    ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
+    trace_migration_completion_vm_stop(ret);
+    if (ret < 0) {
+        goto out_unlock;
+    }
+
+    ret = migration_maybe_pause(s, current_active_state,
+                                MIGRATION_STATUS_DEVICE);
+    if (ret < 0) {
+        goto out_unlock;
+    }
+
+    /*
+     * Inactivate disks except in COLO, and track that we have done so in order
+     * to remember to reactivate them if migration fails or is cancelled.
+     */
+    s->block_inactive = !migrate_colo();
+    migration_rate_set(RATE_LIMIT_DISABLED);
+    ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
+                                             s->block_inactive);
+out_unlock:
+    qemu_mutex_unlock_iothread();
+    return ret;
+}
+
+static void migration_completion_postcopy(MigrationState *s)
+{
+    trace_migration_completion_postcopy_end();
+
+    qemu_mutex_lock_iothread();
+    qemu_savevm_state_complete_postcopy(s->to_dst_file);
+    qemu_mutex_unlock_iothread();
+
+    /*
+     * Shutdown the postcopy fast path thread.  This is only needed when dest
+     * QEMU binary is old (7.1/7.2).  QEMU 8.0+ doesn't need this.
+     */
+    if (migrate_postcopy_preempt() && s->preempt_pre_7_2) {
+        postcopy_preempt_shutdown_file(s);
+    }
+
+    trace_migration_completion_postcopy_end_after_complete();
+}
+
+static void migration_completion_failed(MigrationState *s,
+                                        int current_active_state)
+{
+    if (s->block_inactive && (s->state == MIGRATION_STATUS_ACTIVE ||
+                              s->state == MIGRATION_STATUS_DEVICE)) {
+        /*
+         * If not doing postcopy, vm_start() will be called: let's
+         * regain control on images.
+         */
+        Error *local_err = NULL;
+
+        qemu_mutex_lock_iothread();
+        bdrv_activate_all(&local_err);
+        if (local_err) {
+            error_report_err(local_err);
+        } else {
+            s->block_inactive = false;
+        }
+        qemu_mutex_unlock_iothread();
+    }
+
+    migrate_set_state(&s->state, current_active_state,
+                      MIGRATION_STATUS_FAILED);
+}
+
 /**
  * migration_completion: Used by migration_thread when there's not much left.
  *   The caller 'breaks' the loop when this returns.
@@ -2325,62 +2405,22 @@ static int migration_maybe_pause(MigrationState *s,
  */
 static void migration_completion(MigrationState *s)
 {
-    int ret;
+    int ret = 0;
     int current_active_state = s->state;
 
     if (s->state == MIGRATION_STATUS_ACTIVE) {
-        qemu_mutex_lock_iothread();
-        s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-        qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
-
-        s->vm_old_state = runstate_get();
-        global_state_store();
-
-        ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
-        trace_migration_completion_vm_stop(ret);
-        if (ret >= 0) {
-            ret = migration_maybe_pause(s, &current_active_state,
-                                        MIGRATION_STATUS_DEVICE);
-        }
-        if (ret >= 0) {
-            /*
-             * Inactivate disks except in COLO, and track that we
-             * have done so in order to remember to reactivate
-             * them if migration fails or is cancelled.
-             */
-            s->block_inactive = !migrate_colo();
-            migration_rate_set(RATE_LIMIT_DISABLED);
-            ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
-                                                     s->block_inactive);
-        }
-
-        qemu_mutex_unlock_iothread();
-
-        if (ret < 0) {
-            goto fail;
-        }
+        ret = migration_completion_precopy(s, &current_active_state);
     } else if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
-        trace_migration_completion_postcopy_end();
-
-        qemu_mutex_lock_iothread();
-        qemu_savevm_state_complete_postcopy(s->to_dst_file);
-        qemu_mutex_unlock_iothread();
-
-        /*
-         * Shutdown the postcopy fast path thread.  This is only needed
-         * when dest QEMU binary is old (7.1/7.2).  QEMU 8.0+ doesn't need
-         * this.
-         */
-        if (migrate_postcopy_preempt() && s->preempt_pre_7_2) {
-            postcopy_preempt_shutdown_file(s);
-        }
-
-        trace_migration_completion_postcopy_end_after_complete();
+        migration_completion_postcopy(s);
     } else {
+        ret = -1;
+    }
+
+    if (ret < 0) {
         goto fail;
     }
 
-    if (await_return_path_close_on_source(s)) {
+    if (close_return_path_on_source(s)) {
         goto fail;
     }
 
@@ -2401,26 +2441,7 @@ static void migration_completion(MigrationState *s)
     return;
 
 fail:
-    if (s->block_inactive && (s->state == MIGRATION_STATUS_ACTIVE ||
-                              s->state == MIGRATION_STATUS_DEVICE)) {
-        /*
-         * If not doing postcopy, vm_start() will be called: let's
-         * regain control on images.
-         */
-        Error *local_err = NULL;
-
-        qemu_mutex_lock_iothread();
-        bdrv_activate_all(&local_err);
-        if (local_err) {
-            error_report_err(local_err);
-        } else {
-            s->block_inactive = false;
-        }
-        qemu_mutex_unlock_iothread();
-    }
-
-    migrate_set_state(&s->state, current_active_state,
-                      MIGRATION_STATUS_FAILED);
+    migration_completion_failed(s, current_active_state);
 }
 
 /**
@@ -2563,7 +2584,7 @@ static MigThrError postcopy_pause(MigrationState *s)
          * path and just wait for the thread to finish. It will be
          * re-created when we resume.
          */
-        await_return_path_close_on_source(s);
+        close_return_path_on_source(s);
 
         migrate_set_state(&s->state, s->state,
                           MIGRATION_STATUS_POSTCOPY_PAUSED);
-- 
2.41.0




* [PULL 02/38] migration: Use g_autofree to simplify ram_dirty_bitmap_reload()
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
  2023-10-17  8:29 ` [PULL 01/38] migration: refactor migration_completion Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 03/38] migration: Allow user to specify available switchover bandwidth Juan Quintela
                   ` (36 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Philippe Mathieu-Daudé

From: Philippe Mathieu-Daudé <philmd@linaro.org>

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011023627.86691-1-philmd@linaro.org>
---
 migration/ram.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 2f5ce4d60b..24d91de8b3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4159,7 +4159,8 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
     int ret = -EINVAL;
     /* from_dst_file is always valid because we're within rp_thread */
     QEMUFile *file = s->rp_state.from_dst_file;
-    unsigned long *le_bitmap, nbits = block->used_length >> TARGET_PAGE_BITS;
+    g_autofree unsigned long *le_bitmap = NULL;
+    unsigned long nbits = block->used_length >> TARGET_PAGE_BITS;
     uint64_t local_size = DIV_ROUND_UP(nbits, 8);
     uint64_t size, end_mark;
     RAMState *rs = ram_state;
@@ -4188,8 +4189,7 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
         error_report("%s: ramblock '%s' bitmap size mismatch "
                      "(0x%"PRIx64" != 0x%"PRIx64")", __func__,
                      block->idstr, size, local_size);
-        ret = -EINVAL;
-        goto out;
+        return -EINVAL;
     }
 
     size = qemu_get_buffer(file, (uint8_t *)le_bitmap, local_size);
@@ -4200,15 +4200,13 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
         error_report("%s: read bitmap failed for ramblock '%s': %d"
                      " (size 0x%"PRIx64", got: 0x%"PRIx64")",
                      __func__, block->idstr, ret, local_size, size);
-        ret = -EIO;
-        goto out;
+        return -EIO;
     }
 
     if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {
         error_report("%s: ramblock '%s' end mark incorrect: 0x%"PRIx64,
                      __func__, block->idstr, end_mark);
-        ret = -EINVAL;
-        goto out;
+        return -EINVAL;
     }
 
     /*
@@ -4240,10 +4238,7 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
      */
     migration_rp_kick(s);
 
-    ret = 0;
-out:
-    g_free(le_bitmap);
-    return ret;
+    return 0;
 }
 
 static int ram_resume_prepare(MigrationState *s, void *opaque)
-- 
2.41.0




* [PULL 03/38] migration: Allow user to specify available switchover bandwidth
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
  2023-10-17  8:29 ` [PULL 01/38] migration: refactor migration_completion Juan Quintela
  2023-10-17  8:29 ` [PULL 02/38] migration: Use g_autofree to simplify ram_dirty_bitmap_reload() Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 04/38] migration: fix RAMBlock add NULL check Juan Quintela
                   ` (35 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier, Zhiyi Guo,
	Joao Martins

From: Peter Xu <peterx@redhat.com>

Migration bandwidth is a very important value for live migration: it is
one of the major factors we use when deciding when to switch over to the
destination in a precopy process.

This value is currently estimated by QEMU during the whole live migration
process by monitoring how fast we are sending the data.  In an ideal world
where we are always feeding unlimited data into the migration channel, this
would be the most accurate estimate, since the send rate would only be
limited by the bandwidth that is available.

However, in reality it may be very different: e.g., over a 10Gbps network
we can see query-migrate showing a migration bandwidth of only a few tens
of MB/s, just because there are plenty of other things the migration thread
might be doing.  For example, the migration thread can be busy scanning
zero pages, or it can be fetching the dirty bitmap from external dirty
sources (like vhost or KVM).  This means we may not be pushing as much data
as possible into the migration channel, so the bandwidth estimated from
"how much data we sent over the channel" can sometimes be dramatically
inaccurate.

With that, the switchover decision is affected: we may assume that we
cannot switch over at all with such a low bandwidth, when in reality we
can.

With such a wrong bandwidth estimate, the migration may not even converge
with the specified downtime, and can keep iterating forever on a low
estimate.

The issue is that QEMU itself may not be able to avoid those uncertainties
when measuring the real "available migration bandwidth", at least not in
any way I can think of so far.

One way to fix this is to let a user who is fully aware of the available
bandwidth provide an accurate value.

For example, if the user has a dedicated 10Gbps channel for migrating this
specific VM, the user can specify this bandwidth so QEMU can always do the
calculation based on that fact, trusting the user as long as it is
specified.  It may not be the exact bandwidth available when switching over
(in which case QEMU will push migration data as fast as possible anyway),
but it is much better than QEMU wildly guessing, especially when the guess
is very wrong.

A new parameter "avail-switchover-bandwidth" is introduced just for this.
When the user specifies this parameter, instead of trusting the value QEMU
estimates itself (based on the QEMUFile send speed), QEMU trusts the user
and uses this value to decide when to switch over, assuming that such
bandwidth will be available then.

Note that specifying this value will not throttle the bandwidth for
switchover yet, so QEMU will always use the full bandwidth possible for
sending switchover data, assuming that should always be the most important
way to use the network at that time.

This can resolve issues like a migration that never converges because a
ridiculously low "migration bandwidth" was detected for whatever reason.

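For a rough sense of how the new value feeds into the switchover decision,
here is a back-of-the-envelope sketch in Python mirroring the calculation
in the patch below; the 10Gbps link and 300ms downtime limit are made-up
numbers, not defaults:

  # Illustrative values only.
  avail_switchover_bandwidth = 10 * 1000 ** 3 // 8  # 10 Gbps in bytes/second
  downtime_limit_ms = 300                           # downtime-limit in ms

  # Same arithmetic as migration_update_counters() when the parameter is set.
  expected_bw_per_ms = avail_switchover_bandwidth / 1000
  threshold_size = expected_bw_per_ms * downtime_limit_ms

  print(threshold_size)  # 375000000.0, i.e. switch over once the remaining
                         # dirty data fits in roughly 375 MB
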
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231010221922.40638-1-peterx@redhat.com>
---
 qapi/migration.json            | 34 +++++++++++++++++++++++++++++++++-
 migration/migration.h          |  2 +-
 migration/options.h            |  1 +
 migration/migration-hmp-cmds.c | 14 ++++++++++++++
 migration/migration.c          | 24 +++++++++++++++++++++---
 migration/options.c            | 28 ++++++++++++++++++++++++++++
 migration/trace-events         |  2 +-
 7 files changed, 99 insertions(+), 6 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index d7dfaa5db9..360e609f66 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -758,6 +758,16 @@
 # @max-bandwidth: to set maximum speed for migration.  maximum speed
 #     in bytes per second.  (Since 2.8)
 #
+# @avail-switchover-bandwidth: to set the available bandwidth that
+#     migration can use during switchover phase.  NOTE!  This does not
+#     limit the bandwidth during switchover, but only for calculations when
+#     making decisions to switchover.  By default, this value is zero,
+#     which means QEMU will estimate the bandwidth automatically.  This can
+#     be set when the estimated value is not accurate, while the user is
+#     able to guarantee such bandwidth is available when switching over.
+#     When specified correctly, this can make the switchover decision much
+#     more accurate.  (Since 8.2)
+#
 # @downtime-limit: set maximum tolerated downtime for migration.
 #     maximum downtime in milliseconds (Since 2.8)
 #
@@ -839,7 +849,7 @@
            'cpu-throttle-initial', 'cpu-throttle-increment',
            'cpu-throttle-tailslow',
            'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
-           'downtime-limit',
+           'avail-switchover-bandwidth', 'downtime-limit',
            { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
            'block-incremental',
            'multifd-channels',
@@ -924,6 +934,16 @@
 # @max-bandwidth: to set maximum speed for migration.  maximum speed
 #     in bytes per second.  (Since 2.8)
 #
+# @avail-switchover-bandwidth: to set the available bandwidth that
+#     migration can use during switchover phase.  NOTE!  This does not
+#     limit the bandwidth during switchover, but only for calculations when
+#     making decisions to switchover.  By default, this value is zero,
+#     which means QEMU will estimate the bandwidth automatically.  This can
+#     be set when the estimated value is not accurate, while the user is
+#     able to guarantee such bandwidth is available when switching over.
+#     When specified correctly, this can make the switchover decision much
+#     more accurate.  (Since 8.2)
+#
 # @downtime-limit: set maximum tolerated downtime for migration.
 #     maximum downtime in milliseconds (Since 2.8)
 #
@@ -1017,6 +1037,7 @@
             '*tls-hostname': 'StrOrNull',
             '*tls-authz': 'StrOrNull',
             '*max-bandwidth': 'size',
+            '*avail-switchover-bandwidth': 'size',
             '*downtime-limit': 'uint64',
             '*x-checkpoint-delay': { 'type': 'uint32',
                                      'features': [ 'unstable' ] },
@@ -1127,6 +1148,16 @@
 # @max-bandwidth: to set maximum speed for migration.  maximum speed
 #     in bytes per second.  (Since 2.8)
 #
+# @avail-switchover-bandwidth: to set the available bandwidth that
+#     migration can use during switchover phase.  NOTE!  This does not
+#     limit the bandwidth during switchover, but only for calculations when
+#     making decisions to switchover.  By default, this value is zero,
+#     which means QEMU will estimate the bandwidth automatically.  This can
+#     be set when the estimated value is not accurate, while the user is
+#     able to guarantee such bandwidth is available when switching over.
+#     When specified correctly, this can make the switchover decision much
+#     more accurate.  (Since 8.2)
+#
 # @downtime-limit: set maximum tolerated downtime for migration.
 #     maximum downtime in milliseconds (Since 2.8)
 #
@@ -1217,6 +1248,7 @@
             '*tls-hostname': 'str',
             '*tls-authz': 'str',
             '*max-bandwidth': 'size',
+            '*avail-switchover-bandwidth': 'size',
             '*downtime-limit': 'uint64',
             '*x-checkpoint-delay': { 'type': 'uint32',
                                      'features': [ 'unstable' ] },
diff --git a/migration/migration.h b/migration/migration.h
index cd5534337c..974897a8d0 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -294,7 +294,7 @@ struct MigrationState {
     /*
      * The final stage happens when the remaining data is smaller than
      * this threshold; it's calculated from the requested downtime and
-     * measured bandwidth
+     * measured bandwidth, or avail-switchover-bandwidth if specified.
      */
     int64_t threshold_size;
 
diff --git a/migration/options.h b/migration/options.h
index 045e2a41a2..93ee938ab8 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -80,6 +80,7 @@ int migrate_decompress_threads(void);
 uint64_t migrate_downtime_limit(void);
 uint8_t migrate_max_cpu_throttle(void);
 uint64_t migrate_max_bandwidth(void);
+uint64_t migrate_avail_switchover_bandwidth(void);
 uint64_t migrate_max_postcopy_bandwidth(void);
 int migrate_multifd_channels(void);
 MultiFDCompression migrate_multifd_compression(void);
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 5b25ba24f7..d206700a43 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -321,6 +321,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
             MigrationParameter_str(MIGRATION_PARAMETER_MAX_BANDWIDTH),
             params->max_bandwidth);
+        assert(params->has_avail_switchover_bandwidth);
+        monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
+            MigrationParameter_str(MIGRATION_PARAMETER_AVAIL_SWITCHOVER_BANDWIDTH),
+            params->avail_switchover_bandwidth);
         assert(params->has_downtime_limit);
         monitor_printf(mon, "%s: %" PRIu64 " ms\n",
             MigrationParameter_str(MIGRATION_PARAMETER_DOWNTIME_LIMIT),
@@ -574,6 +578,16 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         }
         p->max_bandwidth = valuebw;
         break;
+    case MIGRATION_PARAMETER_AVAIL_SWITCHOVER_BANDWIDTH:
+        p->has_avail_switchover_bandwidth = true;
+        ret = qemu_strtosz_MiB(valuestr, NULL, &valuebw);
+        if (ret < 0 || valuebw > INT64_MAX
+            || (size_t)valuebw != valuebw) {
+            error_setg(&err, "Invalid size %s", valuestr);
+            break;
+        }
+        p->avail_switchover_bandwidth = valuebw;
+        break;
     case MIGRATION_PARAMETER_DOWNTIME_LIMIT:
         p->has_downtime_limit = true;
         visit_type_size(v, param, &p->downtime_limit, &err);
diff --git a/migration/migration.c b/migration/migration.c
index 0e1002d017..ed04ca3b1c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2710,17 +2710,33 @@ static void migration_update_counters(MigrationState *s,
 {
     uint64_t transferred, transferred_pages, time_spent;
     uint64_t current_bytes; /* bytes transferred since the beginning */
+    uint64_t switchover_bw;
+    /* Expected bandwidth when switching over to destination QEMU */
+    double expected_bw_per_ms;
     double bandwidth;
 
     if (current_time < s->iteration_start_time + BUFFER_DELAY) {
         return;
     }
 
+    switchover_bw = migrate_avail_switchover_bandwidth();
     current_bytes = migration_transferred_bytes(s->to_dst_file);
     transferred = current_bytes - s->iteration_initial_bytes;
     time_spent = current_time - s->iteration_start_time;
     bandwidth = (double)transferred / time_spent;
-    s->threshold_size = bandwidth * migrate_downtime_limit();
+
+    if (switchover_bw) {
+        /*
+         * If the user specified a switchover bandwidth, let's trust the
+         * user so that can be more accurate than what we estimated.
+         */
+        expected_bw_per_ms = switchover_bw / 1000;
+    } else {
+        /* If the user doesn't specify bandwidth, we use the estimated */
+        expected_bw_per_ms = bandwidth;
+    }
+
+    s->threshold_size = expected_bw_per_ms * migrate_downtime_limit();
 
     s->mbps = (((double) transferred * 8.0) /
                ((double) time_spent / 1000.0)) / 1000.0 / 1000.0;
@@ -2737,7 +2753,7 @@ static void migration_update_counters(MigrationState *s,
     if (stat64_get(&mig_stats.dirty_pages_rate) &&
         transferred > 10000) {
         s->expected_downtime =
-            stat64_get(&mig_stats.dirty_bytes_last_sync) / bandwidth;
+            stat64_get(&mig_stats.dirty_bytes_last_sync) / expected_bw_per_ms;
     }
 
     migration_rate_reset(s->to_dst_file);
@@ -2745,7 +2761,9 @@ static void migration_update_counters(MigrationState *s,
     update_iteration_initial_status(s);
 
     trace_migrate_transferred(transferred, time_spent,
-                              bandwidth, s->threshold_size);
+                              /* Both in unit bytes/ms */
+                              bandwidth, switchover_bw / 1000,
+                              s->threshold_size);
 }
 
 static bool migration_can_switchover(MigrationState *s)
diff --git a/migration/options.c b/migration/options.c
index 6bbfd4853d..546cbe3106 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -125,6 +125,8 @@ Property migration_properties[] = {
                       parameters.cpu_throttle_tailslow, false),
     DEFINE_PROP_SIZE("x-max-bandwidth", MigrationState,
                       parameters.max_bandwidth, MAX_THROTTLE),
+    DEFINE_PROP_SIZE("avail-switchover-bandwidth", MigrationState,
+                      parameters.avail_switchover_bandwidth, 0),
     DEFINE_PROP_UINT64("x-downtime-limit", MigrationState,
                       parameters.downtime_limit,
                       DEFAULT_MIGRATE_SET_DOWNTIME),
@@ -780,6 +782,13 @@ uint64_t migrate_max_bandwidth(void)
     return s->parameters.max_bandwidth;
 }
 
+uint64_t migrate_avail_switchover_bandwidth(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return s->parameters.avail_switchover_bandwidth;
+}
+
 uint64_t migrate_max_postcopy_bandwidth(void)
 {
     MigrationState *s = migrate_get_current();
@@ -917,6 +926,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
                                  s->parameters.tls_authz : "");
     params->has_max_bandwidth = true;
     params->max_bandwidth = s->parameters.max_bandwidth;
+    params->has_avail_switchover_bandwidth = true;
+    params->avail_switchover_bandwidth = s->parameters.avail_switchover_bandwidth;
     params->has_downtime_limit = true;
     params->downtime_limit = s->parameters.downtime_limit;
     params->has_x_checkpoint_delay = true;
@@ -1056,6 +1067,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
         return false;
     }
 
+    if (params->has_avail_switchover_bandwidth &&
+        (params->avail_switchover_bandwidth > SIZE_MAX)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                   "avail_switchover_bandwidth",
+                   "an integer in the range of 0 to "stringify(SIZE_MAX)
+                   " bytes/second");
+        return false;
+    }
+
     if (params->has_downtime_limit &&
         (params->downtime_limit > MAX_MIGRATE_DOWNTIME)) {
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
@@ -1225,6 +1245,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
         dest->max_bandwidth = params->max_bandwidth;
     }
 
+    if (params->has_avail_switchover_bandwidth) {
+        dest->avail_switchover_bandwidth = params->avail_switchover_bandwidth;
+    }
+
     if (params->has_downtime_limit) {
         dest->downtime_limit = params->downtime_limit;
     }
@@ -1341,6 +1365,10 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
         }
     }
 
+    if (params->has_avail_switchover_bandwidth) {
+        s->parameters.avail_switchover_bandwidth = params->avail_switchover_bandwidth;
+    }
+
     if (params->has_downtime_limit) {
         s->parameters.downtime_limit = params->downtime_limit;
     }
diff --git a/migration/trace-events b/migration/trace-events
index ee9c8f4d63..d8c2aa846d 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -186,7 +186,7 @@ source_return_path_thread_shut(uint32_t val) "0x%x"
 source_return_path_thread_resume_ack(uint32_t v) "%"PRIu32
 source_return_path_thread_switchover_acked(void) ""
 migration_thread_low_pending(uint64_t pending) "%" PRIu64
-migrate_transferred(uint64_t transferred, uint64_t time_spent, uint64_t bandwidth, uint64_t size) "transferred %" PRIu64 " time_spent %" PRIu64 " bandwidth %" PRIu64 " max_size %" PRId64
+migrate_transferred(uint64_t transferred, uint64_t time_spent, uint64_t bandwidth, uint64_t avail_bw, uint64_t size) "transferred %" PRIu64 " time_spent %" PRIu64 " bandwidth %" PRIu64 " switchover_bw %" PRIu64 " max_size %" PRId64
 process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
 process_incoming_migration_co_postcopy_end_main(void) ""
 postcopy_preempt_enabled(bool value) "%d"
-- 
2.41.0




* [PULL 04/38] migration: fix RAMBlock add NULL check
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (2 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 03/38] migration: Allow user to specify available switchover bandwidth Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 05/38] migration: Add the configuration vmstate to the json writer Juan Quintela
                   ` (34 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Dmitry Frolov

From: Dmitry Frolov <frolov@swemel.ru>

qemu_ram_block_from_host() may return NULL, which would then be dereferenced
without a check. The return value of this function is usually checked.
Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231010104851.802947-1-frolov@swemel.ru>
---
 migration/ram.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 24d91de8b3..e8df4dc862 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4285,6 +4285,11 @@ static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
     RAMBlock *rb = qemu_ram_block_from_host(host, false, &offset);
     Error *err = NULL;
 
+    if (!rb) {
+        error_report("RAM block not found");
+        return;
+    }
+
     if (migrate_ram_is_ignored(rb)) {
         return;
     }
-- 
2.41.0




* [PULL 05/38] migration: Add the configuration vmstate to the json writer
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (3 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 04/38] migration: fix RAMBlock add NULL check Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 06/38] migration: Fix analyze-migration.py 'configuration' parsing Juan Quintela
                   ` (33 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

Make the migration json writer part of the MigrationState struct, allowing
the 'configuration' object to be serialized to JSON.

This will facilitate the parsing of the 'configuration' object in the
next patch that fixes analyze-migration.py for arm.

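As a rough, heavily abbreviated sketch (Python notation; the concrete field
values are assumptions), the top-level vmdesc JSON now ends up shaped along
these lines, which is what the analyzer script in the following patches
relies on:

  vmsd_desc = {
      "configuration": {"version": 1},  # plus the vmstate field descriptions
      "page_size": 4096,                # target page size
      "devices": [],                    # per-device vmstate descriptions
  }
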
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231009184326.15777-2-farosas@suse.de>
---
 migration/migration.c |  1 +
 migration/savevm.c    | 20 ++++++++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index ed04ca3b1c..98151b1424 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1442,6 +1442,7 @@ int migrate_init(MigrationState *s, Error **errp)
     error_free(s->error);
     s->error = NULL;
     s->hostname = NULL;
+    s->vmdesc = NULL;
 
     migrate_set_state(&s->state, MIGRATION_STATUS_NONE, MIGRATION_STATUS_SETUP);
 
diff --git a/migration/savevm.c b/migration/savevm.c
index 497ce02bd7..bce698b0af 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1217,13 +1217,27 @@ void qemu_savevm_non_migratable_list(strList **reasons)
 
 void qemu_savevm_state_header(QEMUFile *f)
 {
+    MigrationState *s = migrate_get_current();
+
+    s->vmdesc = json_writer_new(false);
+
     trace_savevm_state_header();
     qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
     qemu_put_be32(f, QEMU_VM_FILE_VERSION);
 
-    if (migrate_get_current()->send_configuration) {
+    if (s->send_configuration) {
         qemu_put_byte(f, QEMU_VM_CONFIGURATION);
-        vmstate_save_state(f, &vmstate_configuration, &savevm_state, 0);
+
+        /*
+         * This starts the main json object and is paired with the
+         * json_writer_end_object in
+         * qemu_savevm_state_complete_precopy_non_iterable
+         */
+        json_writer_start_object(s->vmdesc, NULL);
+
+        json_writer_start_object(s->vmdesc, "configuration");
+        vmstate_save_state(f, &vmstate_configuration, &savevm_state, s->vmdesc);
+        json_writer_end_object(s->vmdesc);
     }
 }
 
@@ -1272,8 +1286,6 @@ void qemu_savevm_state_setup(QEMUFile *f)
     Error *local_err = NULL;
     int ret;
 
-    ms->vmdesc = json_writer_new(false);
-    json_writer_start_object(ms->vmdesc, NULL);
     json_writer_int64(ms->vmdesc, "page_size", qemu_target_page_size());
     json_writer_start_array(ms->vmdesc, "devices");
 
-- 
2.41.0




* [PULL 06/38] migration: Fix analyze-migration.py 'configuration' parsing
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (4 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 05/38] migration: Add the configuration vmstate to the json writer Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 07/38] migration: Add capability parsing to analyze-migration.py Juan Quintela
                   ` (32 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

The 'configuration' state subsections are currently not being parsed
and the script fails when analyzing an aarch64 stream:

Traceback (most recent call last):
  File "./scripts/analyze-migration.py", line 625, in <module>
    dump.read(dump_memory = args.memory)
  File "./scripts/analyze-migration.py", line 571, in read
    raise Exception("Unknown section type: %d" % section_type)
Exception: Unknown section type: 5

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231009184326.15777-3-farosas@suse.de>
---
 scripts/analyze-migration.py | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index 082424558b..24687db497 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -261,12 +261,21 @@ def getDict(self):
 
 
 class ConfigurationSection(object):
-    def __init__(self, file):
+    def __init__(self, file, desc):
         self.file = file
+        self.desc = desc
 
     def read(self):
-        name_len = self.file.read32()
-        name = self.file.readstr(len = name_len)
+        if self.desc:
+            version_id = self.desc['version']
+            section = VMSDSection(self.file, version_id, self.desc,
+                                  'configuration')
+            section.read()
+        else:
+            # backward compatibility for older streams that don't have
+            # the configuration section in the json
+            name_len = self.file.read32()
+            name = self.file.readstr(len = name_len)
 
 class VMSDFieldGeneric(object):
     def __init__(self, desc, file):
@@ -532,7 +541,8 @@ def read(self, desc_only = False, dump_memory = False, write_memory = False):
             if section_type == self.QEMU_VM_EOF:
                 break
             elif section_type == self.QEMU_VM_CONFIGURATION:
-                section = ConfigurationSection(file)
+                config_desc = self.vmsd_desc.get('configuration')
+                section = ConfigurationSection(file, config_desc)
                 section.read()
             elif section_type == self.QEMU_VM_SECTION_START or section_type == self.QEMU_VM_SECTION_FULL:
                 section_id = file.read32()
-- 
2.41.0




* [PULL 07/38] migration: Add capability parsing to analyze-migration.py
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (5 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 06/38] migration: Fix analyze-migration.py 'configuration' parsing Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 08/38] migration: Fix analyze-migration.py when ignore-shared is used Juan Quintela
                   ` (31 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

The script is broken when the configuration/capabilities section is
present. Add support for parsing the capabilities so we can fix it in
the next patch.

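For illustration only (not part of the patch): the wire format that the new
VMSDFieldCap.read() below consumes is a one-byte length followed by the
capability name, decoded here from a hypothetical buffer:

  import io

  # Hypothetical buffer holding one encoded capability name.
  buf = io.BytesIO(bytes([len(b"x-ignore-shared")]) + b"x-ignore-shared")

  length = int.from_bytes(buf.read(1), byteorder='big')
  name = buf.read(length).decode('utf-8')
  assert name == "x-ignore-shared"
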
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231009184326.15777-4-farosas@suse.de>
---
 scripts/analyze-migration.py | 38 ++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index 24687db497..c700fed64d 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -264,6 +264,24 @@ class ConfigurationSection(object):
     def __init__(self, file, desc):
         self.file = file
         self.desc = desc
+        self.caps = []
+
+    def parse_capabilities(self, vmsd_caps):
+        if not vmsd_caps:
+            return
+
+        ncaps = vmsd_caps.data['caps_count'].data
+        self.caps = vmsd_caps.data['capabilities']
+
+        if type(self.caps) != list:
+            self.caps = [self.caps]
+
+        if len(self.caps) != ncaps:
+            raise Exception("Number of capabilities doesn't match "
+                            "caps_count field")
+
+    def has_capability(self, cap):
+        return any([str(c) == cap for c in self.caps])
 
     def read(self):
         if self.desc:
@@ -271,6 +289,8 @@ def read(self):
             section = VMSDSection(self.file, version_id, self.desc,
                                   'configuration')
             section.read()
+            self.parse_capabilities(
+                section.data.get("configuration/capabilities"))
         else:
             # backward compatibility for older streams that don't have
             # the configuration section in the json
@@ -297,6 +317,23 @@ def read(self):
         self.data = self.file.readvar(size)
         return self.data
 
+class VMSDFieldCap(object):
+    def __init__(self, desc, file):
+        self.file = file
+        self.desc = desc
+        self.data = ""
+
+    def __repr__(self):
+        return self.data
+
+    def __str__(self):
+        return self.data
+
+    def read(self):
+        len = self.file.read8()
+        self.data = self.file.readstr(len)
+
+
 class VMSDFieldInt(VMSDFieldGeneric):
     def __init__(self, desc, file):
         super(VMSDFieldInt, self).__init__(desc, file)
@@ -471,6 +508,7 @@ def getDict(self):
     "unused_buffer" : VMSDFieldGeneric,
     "bitmap" : VMSDFieldGeneric,
     "struct" : VMSDFieldStruct,
+    "capability": VMSDFieldCap,
     "unknown" : VMSDFieldGeneric,
 }
 
-- 
2.41.0




* [PULL 08/38] migration: Fix analyze-migration.py when ignore-shared is used
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (6 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 07/38] migration: Add capability parsing to analyze-migration.py Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 09/38] migration: Fix analyze-migration read operation signedness Juan Quintela
                   ` (30 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

The script is currently broken when the x-ignore-shared capability is
used:

Traceback (most recent call last):
  File "./scripts/analyze-migration.py", line 656, in <module>
    dump.read(dump_memory = args.memory)
  File "./scripts/analyze-migration.py", line 593, in read
    section.read()
  File "./scripts/analyze-migration.py", line 163, in read
    self.name = self.file.readstr(len = namelen)
  File "./scripts/analyze-migration.py", line 53, in readstr
    return self.readvar(len).decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 55: invalid start byte

We're currently adding data to the middle of the ram section depending
on the presence of the capability. As a consequence, any code loading
the ram section needs to know about capabilities so it can interpret
the stream.

To fix the script, skip the extra data that's added when x-ignore-shared
is used.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231009184326.15777-5-farosas@suse.de>
---
 scripts/analyze-migration.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index c700fed64d..56ab04dd2d 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -123,6 +123,7 @@ def __init__(self, file, version_id, ramargs, section_key):
         self.TARGET_PAGE_SIZE = ramargs['page_size']
         self.dump_memory = ramargs['dump_memory']
         self.write_memory = ramargs['write_memory']
+        self.ignore_shared = ramargs['ignore_shared']
         self.sizeinfo = collections.OrderedDict()
         self.data = collections.OrderedDict()
         self.data['section sizes'] = self.sizeinfo
@@ -169,6 +170,8 @@ def read(self):
                         f.truncate(0)
                         f.truncate(len)
                         self.files[self.name] = f
+                    if self.ignore_shared:
+                        mr_addr = self.file.read64()
                 flags &= ~self.RAM_SAVE_FLAG_MEM_SIZE
 
             if flags & self.RAM_SAVE_FLAG_COMPRESS:
@@ -572,6 +575,7 @@ def read(self, desc_only = False, dump_memory = False, write_memory = False):
         ramargs['page_size'] = self.vmsd_desc['page_size']
         ramargs['dump_memory'] = dump_memory
         ramargs['write_memory'] = write_memory
+        ramargs['ignore_shared'] = False
         self.section_classes[('ram',0)][1] = ramargs
 
         while True:
@@ -582,6 +586,7 @@ def read(self, desc_only = False, dump_memory = False, write_memory = False):
                 config_desc = self.vmsd_desc.get('configuration')
                 section = ConfigurationSection(file, config_desc)
                 section.read()
+                ramargs['ignore_shared'] = section.has_capability('x-ignore-shared')
             elif section_type == self.QEMU_VM_SECTION_START or section_type == self.QEMU_VM_SECTION_FULL:
                 section_id = file.read32()
                 name = file.readstr()
-- 
2.41.0




* [PULL 09/38] migration: Fix analyze-migration read operation signedness
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (7 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 08/38] migration: Fix analyze-migration.py when ignore-shared is used Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script Juan Quintela
                   ` (29 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

The migration code uses unsigned values for 16, 32 and 64-bit
operations. Fix the script to do the same.

This was causing an issue when parsing the migration stream generated
on the ppc64 target, because one of the instance_ids was larger than
the 32-bit signed maximum:

Traceback (most recent call last):
  File "/home/fabiano/kvm/qemu/build/scripts/analyze-migration.py", line 658, in <module>
    dump.read(dump_memory = args.memory)
  File "/home/fabiano/kvm/qemu/build/scripts/analyze-migration.py", line 592, in read
    classdesc = self.section_classes[section_key]
KeyError: ('spapr_iommu', -2147483648)
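
For reference (not part of the patch), the issue is just how Python's
int.from_bytes() treats the sign bit; a quick sketch:

    raw = bytes([0x80, 0x00, 0x00, 0x00])
    int.from_bytes(raw, byteorder='big', signed=True)   # -> -2147483648
    int.from_bytes(raw, byteorder='big', signed=False)  # ->  2147483648

A valid instance_id with the top bit set therefore turned into the
bogus negative section key seen in the traceback above.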

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231009184326.15777-6-farosas@suse.de>
---
 scripts/analyze-migration.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index 56ab04dd2d..de506cb8bf 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -38,13 +38,13 @@ def __init__(self, filename):
         self.file = open(self.filename, "rb")
 
     def read64(self):
-        return int.from_bytes(self.file.read(8), byteorder='big', signed=True)
+        return int.from_bytes(self.file.read(8), byteorder='big', signed=False)
 
     def read32(self):
-        return int.from_bytes(self.file.read(4), byteorder='big', signed=True)
+        return int.from_bytes(self.file.read(4), byteorder='big', signed=False)
 
     def read16(self):
-        return int.from_bytes(self.file.read(2), byteorder='big', signed=True)
+        return int.from_bytes(self.file.read(2), byteorder='big', signed=False)
 
     def read8(self):
         return int.from_bytes(self.file.read(1), byteorder='big', signed=True)
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (8 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 09/38] migration: Fix analyze-migration read operation signedness Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2024-05-21 12:24   ` Alex Bennée
  2023-10-17  8:29 ` [PULL 11/38] tests/qtest: migration-test: Add tests for file-based migration Juan Quintela
                   ` (28 subsequent siblings)
  38 siblings, 1 reply; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

Add a smoke test that migrates to a file and gives it to the
script. It should catch the most annoying errors such as changes in
the ram flags.

After code has been merged it becomes much harder to figure out what is
causing the script to fail; the person making the change is the most
likely to know right away what the problem is.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231009184326.15777-7-farosas@suse.de>
---
 tests/qtest/migration-test.c | 60 ++++++++++++++++++++++++++++++++++++
 tests/qtest/meson.build      |  2 ++
 2 files changed, 62 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 8eb2053dbb..cef5081f8c 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -66,6 +66,8 @@ static bool got_dst_resume;
  */
 #define DIRTYLIMIT_TOLERANCE_RANGE  25  /* MB/s */
 
+#define ANALYZE_SCRIPT "scripts/analyze-migration.py"
+
 #if defined(__linux__)
 #include <sys/syscall.h>
 #include <sys/vfs.h>
@@ -1501,6 +1503,61 @@ static void test_baddest(void)
     test_migrate_end(from, to, false);
 }
 
+#ifndef _WIN32
+static void test_analyze_script(void)
+{
+    MigrateStart args = {
+        .opts_source = "-uuid 11111111-1111-1111-1111-111111111111",
+    };
+    QTestState *from, *to;
+    g_autofree char *uri = NULL;
+    g_autofree char *file = NULL;
+    int pid, wstatus;
+    const char *python = g_getenv("PYTHON");
+
+    if (!python) {
+        g_test_skip("PYTHON variable not set");
+        return;
+    }
+
+    /* dummy url */
+    if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", &args)) {
+        return;
+    }
+
+    /*
+     * Setting these two capabilities causes the "configuration"
+     * vmstate to include subsections for them. The script needs to
+     * parse those subsections properly.
+     */
+    migrate_set_capability(from, "validate-uuid", true);
+    migrate_set_capability(from, "x-ignore-shared", true);
+
+    file = g_strdup_printf("%s/migfile", tmpfs);
+    uri = g_strdup_printf("exec:cat > %s", file);
+
+    migrate_ensure_converge(from);
+    migrate_qmp(from, uri, "{}");
+    wait_for_migration_complete(from);
+
+    pid = fork();
+    if (!pid) {
+        close(1);
+        open("/dev/null", O_WRONLY);
+        execl(python, python, ANALYZE_SCRIPT, "-f", file, NULL);
+        g_assert_not_reached();
+    }
+
+    g_assert(waitpid(pid, &wstatus, 0) == pid);
+    if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != 0) {
+        g_test_message("Failed to analyze the migration stream");
+        g_test_fail();
+    }
+    test_migrate_end(from, to, false);
+    cleanup("migfile");
+}
+#endif
+
 static void test_precopy_common(MigrateCommon *args)
 {
     QTestState *from, *to;
@@ -2837,6 +2894,9 @@ int main(int argc, char **argv)
     }
 
     qtest_add_func("/migration/bad_dest", test_baddest);
+#ifndef _WIN32
+    qtest_add_func("/migration/analyze-script", test_analyze_script);
+#endif
     qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
     qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
     /*
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 66795cfcd2..d6022ebd64 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -357,6 +357,8 @@ foreach dir : target_dirs
     test_deps += [qsd]
   endif
 
+  qtest_env.set('PYTHON', python.full_path())
+
   foreach test : target_qtests
     # Executables are shared across targets, declare them only the first time we
     # encounter them
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 11/38] tests/qtest: migration-test: Add tests for file-based migration
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (9 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 12/38] migration: hold the BQL during setup Juan Quintela
                   ` (27 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

Add basic tests for file-based migration.

Note that we cannot use test_precopy_common because that routine
expects it to be possible to run the migration live. With the file
transport there is no live migration because we must wait for the
source to finish writing the migration data to the file before the
destination can start reading. Add a new migration function
specifically to handle the file migration.
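
As a rough sketch of the ordering the new function enforces (qmp() and
wait_for_status() below are hypothetical helpers, not the actual qtest
API):

    uri = "file:/tmp/migfile"

    qmp(src, "migrate", {"uri": uri})            # source writes the stream
    wait_for_status(src, "completed")            # must finish first

    qmp(dst, "migrate-incoming", {"uri": uri})   # only now read the file
    wait_for_status(dst, "completed")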

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-7-farosas@suse.de>
---
 tests/qtest/migration-test.c | 147 +++++++++++++++++++++++++++++++++++
 1 file changed, 147 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index cef5081f8c..e1c110537b 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -68,6 +68,10 @@ static bool got_dst_resume;
 
 #define ANALYZE_SCRIPT "scripts/analyze-migration.py"
 
+#define QEMU_VM_FILE_MAGIC 0x5145564d
+#define FILE_TEST_FILENAME "migfile"
+#define FILE_TEST_OFFSET 0x1000
+
 #if defined(__linux__)
 #include <sys/syscall.h>
 #include <sys/vfs.h>
@@ -884,6 +888,7 @@ static void test_migrate_end(QTestState *from, QTestState *to, bool test_dest)
     cleanup("migsocket");
     cleanup("src_serial");
     cleanup("dest_serial");
+    cleanup(FILE_TEST_FILENAME);
 }
 
 #ifdef CONFIG_GNUTLS
@@ -1667,6 +1672,70 @@ finish:
     test_migrate_end(from, to, args->result == MIG_TEST_SUCCEED);
 }
 
+static void test_file_common(MigrateCommon *args, bool stop_src)
+{
+    QTestState *from, *to;
+    void *data_hook = NULL;
+    g_autofree char *connect_uri = g_strdup(args->connect_uri);
+
+    if (test_migrate_start(&from, &to, args->listen_uri, &args->start)) {
+        return;
+    }
+
+    /*
+     * File migration is never live. We can keep the source VM running
+     * during migration, but the destination will not be running
+     * concurrently.
+     */
+    g_assert_false(args->live);
+
+    if (args->start_hook) {
+        data_hook = args->start_hook(from, to);
+    }
+
+    migrate_ensure_converge(from);
+    wait_for_serial("src_serial");
+
+    if (stop_src) {
+        qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
+        if (!got_src_stop) {
+            qtest_qmp_eventwait(from, "STOP");
+        }
+    }
+
+    if (args->result == MIG_TEST_QMP_ERROR) {
+        migrate_qmp_fail(from, connect_uri, "{}");
+        goto finish;
+    }
+
+    migrate_qmp(from, connect_uri, "{}");
+    wait_for_migration_complete(from);
+
+    /*
+     * We need to wait for the source to finish before starting the
+     * destination.
+     */
+    migrate_incoming_qmp(to, connect_uri, "{}");
+    wait_for_migration_complete(to);
+
+    if (stop_src) {
+        qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
+    }
+
+    if (!got_dst_resume) {
+        qtest_qmp_eventwait(to, "RESUME");
+    }
+
+    wait_for_serial("dest_serial");
+
+finish:
+    if (args->finish_hook) {
+        args->finish_hook(from, to, data_hook);
+    }
+
+    test_migrate_end(from, to, args->result == MIG_TEST_SUCCEED);
+}
+
 static void test_precopy_unix_plain(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
@@ -1862,6 +1931,76 @@ static void test_precopy_unix_compress_nowait(void)
     test_precopy_common(&args);
 }
 
+static void test_precopy_file(void)
+{
+    g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
+                                           FILE_TEST_FILENAME);
+    MigrateCommon args = {
+        .connect_uri = uri,
+        .listen_uri = "defer",
+    };
+
+    test_file_common(&args, true);
+}
+
+static void file_offset_finish_hook(QTestState *from, QTestState *to,
+                                    void *opaque)
+{
+#if defined(__linux__)
+    g_autofree char *path = g_strdup_printf("%s/%s", tmpfs, FILE_TEST_FILENAME);
+    size_t size = FILE_TEST_OFFSET + sizeof(QEMU_VM_FILE_MAGIC);
+    uintptr_t *addr, *p;
+    int fd;
+
+    fd = open(path, O_RDONLY);
+    g_assert(fd != -1);
+    addr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
+    g_assert(addr != MAP_FAILED);
+
+    /*
+     * Ensure the skipped offset contains zeros and the migration
+     * stream starts at the right place.
+     */
+    p = addr;
+    while (p < addr + FILE_TEST_OFFSET / sizeof(uintptr_t)) {
+        g_assert(*p == 0);
+        p++;
+    }
+    g_assert_cmpint(cpu_to_be64(*p) >> 32, ==, QEMU_VM_FILE_MAGIC);
+
+    munmap(addr, size);
+    close(fd);
+#endif
+}
+
+static void test_precopy_file_offset(void)
+{
+    g_autofree char *uri = g_strdup_printf("file:%s/%s,offset=%d", tmpfs,
+                                           FILE_TEST_FILENAME,
+                                           FILE_TEST_OFFSET);
+    MigrateCommon args = {
+        .connect_uri = uri,
+        .listen_uri = "defer",
+        .finish_hook = file_offset_finish_hook,
+    };
+
+    test_file_common(&args, false);
+}
+
+static void test_precopy_file_offset_bad(void)
+{
+    /* using a value not supported by qemu_strtosz() */
+    g_autofree char *uri = g_strdup_printf("file:%s/%s,offset=0x20M",
+                                           tmpfs, FILE_TEST_FILENAME);
+    MigrateCommon args = {
+        .connect_uri = uri,
+        .listen_uri = "defer",
+        .result = MIG_TEST_QMP_ERROR,
+    };
+
+    test_file_common(&args, false);
+}
+
 static void test_precopy_tcp_plain(void)
 {
     MigrateCommon args = {
@@ -2909,6 +3048,14 @@ int main(int argc, char **argv)
         qtest_add_func("/migration/precopy/unix/compress/nowait",
                        test_precopy_unix_compress_nowait);
     }
+
+    qtest_add_func("/migration/precopy/file",
+                   test_precopy_file);
+    qtest_add_func("/migration/precopy/file/offset",
+                   test_precopy_file_offset);
+    qtest_add_func("/migration/precopy/file/offset/bad",
+                   test_precopy_file_offset_bad);
+
 #ifdef CONFIG_GNUTLS
     qtest_add_func("/migration/precopy/unix/tls/psk",
                    test_precopy_unix_tls_psk);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 12/38] migration: hold the BQL during setup
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (10 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 11/38] tests/qtest: migration-test: Add tests for file-based migration Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 13/38] migration: Non multifd migration don't care about multifd flushes Juan Quintela
                   ` (26 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Fiona Ebner

From: Fiona Ebner <f.ebner@proxmox.com>

This is intended to be a semantic revert of commit 9b09503752
("migration: run setup callbacks out of big lock"). There have been so
many changes since that commit (e.g. a new setup callback
dirty_bitmap_save_setup() that also needs to be adapted now), it's
easier to do the revert manually.

For snapshots, the bdrv_writev_vmstate() function is used during setup
(in QIOChannelBlock backing the QEMUFile), but not holding the BQL
while calling it could lead to an assertion failure. To understand
how, first note the following:

1. Generated coroutine wrappers for block layer functions spawn the
coroutine and use AIO_WAIT_WHILE()/aio_poll() to wait for it.
2. If the host OS switches threads at an inconvenient time, it can
happen that a bottom half scheduled for the main thread's AioContext
is executed as part of a vCPU thread's aio_poll().

An example leading to the assertion failure is as follows:

main thread:
1. A snapshot-save QMP command gets issued.
2. snapshot_save_job_bh() is scheduled.

vCPU thread:
3. aio_poll() for the main thread's AioContext is called (e.g. when
the guest writes to a pflash device, as part of blk_pwrite which is a
generated coroutine wrapper).
4. snapshot_save_job_bh() is executed as part of aio_poll().
5. qemu_savevm_state() is called.
6. qemu_mutex_unlock_iothread() is called. Now
qemu_get_current_aio_context() returns 0x0.
7. bdrv_writev_vmstate() is executed during the usual savevm setup
via qemu_fflush(). But this function is a generated coroutine wrapper,
so it uses AIO_WAIT_WHILE. There, the assertion
assert(qemu_get_current_aio_context() == qemu_get_aio_context());
will fail.

To fix it, ensure that the BQL is held during setup. While it would
only be needed for snapshots, adapting migration too avoids additional
logic for conditional locking/unlocking in the setup callbacks.
Writing the header could (in theory) also trigger qemu_fflush() and
thus bdrv_writev_vmstate(), so the locked section also covers the
qemu_savevm_state_header() call, even for migration, for consistency.

The section around multifd_send_sync_main() needs to be unlocked to
avoid a deadlock. In particular, the multifd_save_setup() function calls
socket_send_channel_create() using multifd_new_send_channel_async() as a
callback and then waits for the callback to signal via the
channels_ready semaphore. The connection happens via
qio_task_run_in_thread(), but the callback is only executed via
qio_task_thread_result() which is scheduled for the main event loop.
Without unlocking the section, the main thread would never get to
process the task result and the callback, meaning there would be no
signal via the channels_ready semaphore.

The comment in ram_init_bitmaps() was introduced by 4987783400
("migration: fix incorrect memory_global_dirty_log_start outside BQL")
and is removed, because it referred to the qemu_mutex_lock_iothread()
call.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231013105839.415989-1-f.ebner@proxmox.com>
---
 include/migration/register.h   | 2 +-
 migration/block-dirty-bitmap.c | 3 ---
 migration/block.c              | 5 -----
 migration/migration.c          | 6 ++++++
 migration/ram.c                | 6 +++---
 migration/savevm.c             | 2 --
 6 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index 2b12c6adec..fed1d04a3c 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -25,6 +25,7 @@ typedef struct SaveVMHandlers {
      * used to perform early checks.
      */
     int (*save_prepare)(void *opaque, Error **errp);
+    int (*save_setup)(QEMUFile *f, void *opaque);
     void (*save_cleanup)(void *opaque);
     int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
     int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
@@ -50,7 +51,6 @@ typedef struct SaveVMHandlers {
     int (*save_live_iterate)(QEMUFile *f, void *opaque);
 
     /* This runs outside the iothread lock!  */
-    int (*save_setup)(QEMUFile *f, void *opaque);
     /* Note for save_live_pending:
      * must_precopy:
      * - must be migrated in precopy or in stopped state
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 032fc5f405..03cb2e72ee 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1214,9 +1214,7 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
     DBMSaveState *s = &((DBMState *)opaque)->save;
     SaveBitmapState *dbms = NULL;
 
-    qemu_mutex_lock_iothread();
     if (init_dirty_bitmap_migration(s) < 0) {
-        qemu_mutex_unlock_iothread();
         return -1;
     }
 
@@ -1224,7 +1222,6 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
         send_bitmap_start(f, s, dbms);
     }
     qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
-    qemu_mutex_unlock_iothread();
     return 0;
 }
 
diff --git a/migration/block.c b/migration/block.c
index d115e1cfa5..b60698d6e2 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -731,18 +731,13 @@ static int block_save_setup(QEMUFile *f, void *opaque)
     trace_migration_block_save("setup", block_mig_state.submitted,
                                block_mig_state.transferred);
 
-    qemu_mutex_lock_iothread();
     ret = init_blk_migration(f);
     if (ret < 0) {
-        qemu_mutex_unlock_iothread();
         return ret;
     }
 
     /* start track dirty blocks */
     ret = set_dirty_tracking();
-
-    qemu_mutex_unlock_iothread();
-
     if (ret) {
         return ret;
     }
diff --git a/migration/migration.c b/migration/migration.c
index 98151b1424..79fa11e3f6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3020,7 +3020,9 @@ static void *migration_thread(void *opaque)
     object_ref(OBJECT(s));
     update_iteration_initial_status(s);
 
+    qemu_mutex_lock_iothread();
     qemu_savevm_state_header(s->to_dst_file);
+    qemu_mutex_unlock_iothread();
 
     /*
      * If we opened the return path, we need to make sure dst has it
@@ -3048,7 +3050,9 @@ static void *migration_thread(void *opaque)
         qemu_savevm_send_colo_enable(s->to_dst_file);
     }
 
+    qemu_mutex_lock_iothread();
     qemu_savevm_state_setup(s->to_dst_file);
+    qemu_mutex_unlock_iothread();
 
     qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
                                MIGRATION_STATUS_ACTIVE);
@@ -3159,8 +3163,10 @@ static void *bg_migration_thread(void *opaque)
     ram_write_tracking_prepare();
 #endif
 
+    qemu_mutex_lock_iothread();
     qemu_savevm_state_header(s->to_dst_file);
     qemu_savevm_state_setup(s->to_dst_file);
+    qemu_mutex_unlock_iothread();
 
     qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
                                MIGRATION_STATUS_ACTIVE);
diff --git a/migration/ram.c b/migration/ram.c
index e8df4dc862..d3d9c8b65b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2891,8 +2891,6 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
 
 static void ram_init_bitmaps(RAMState *rs)
 {
-    /* For memory_global_dirty_log_start below.  */
-    qemu_mutex_lock_iothread();
     qemu_mutex_lock_ramlist();
 
     WITH_RCU_READ_LOCK_GUARD() {
@@ -2904,7 +2902,6 @@ static void ram_init_bitmaps(RAMState *rs)
         }
     }
     qemu_mutex_unlock_ramlist();
-    qemu_mutex_unlock_iothread();
 
     /*
      * After an eventual first bitmap sync, fixup the initial bitmap
@@ -3067,7 +3064,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 
     migration_ops = g_malloc0(sizeof(MigrationOps));
     migration_ops->ram_save_target_page = ram_save_target_page_legacy;
+
+    qemu_mutex_unlock_iothread();
     ret = multifd_send_sync_main(f);
+    qemu_mutex_lock_iothread();
     if (ret < 0) {
         return ret;
     }
diff --git a/migration/savevm.c b/migration/savevm.c
index bce698b0af..8622f229e5 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1672,10 +1672,8 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     }
     ms->to_dst_file = f;
 
-    qemu_mutex_unlock_iothread();
     qemu_savevm_state_header(f);
     qemu_savevm_state_setup(f);
-    qemu_mutex_lock_iothread();
 
     while (qemu_file_get_error(f) == 0) {
         if (qemu_savevm_state_iterate(f, false) > 0) {
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 13/38] migration: Non multifd migration don't care about multifd flushes
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (11 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 12/38] migration: hold the BQL during setup Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-19 11:47   ` Michael Tokarev
  2023-10-17  8:29 ` [PULL 14/38] migration: Create migrate_rdma() Juan Quintela
                   ` (25 subsequent siblings)
  38 siblings, 1 reply; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

RDMA was having trouble because
migrate_multifd_flush_after_each_section() can only be true or false,
but we don't want to send any flush when we are not in multifd
migration.

CC: Fabiano Rosas <farosas@suse.de>
Fixes: 294e5a4034e81 ("multifd: Only flush once each full round of memory")

Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011205548.10571-2-quintela@redhat.com>
---
 migration/ram.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index d3d9c8b65b..acb8f95f00 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1395,7 +1395,8 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
         pss->page = 0;
         pss->block = QLIST_NEXT_RCU(pss->block, next);
         if (!pss->block) {
-            if (!migrate_multifd_flush_after_each_section()) {
+            if (migrate_multifd() &&
+                !migrate_multifd_flush_after_each_section()) {
                 QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
                 int ret = multifd_send_sync_main(f);
                 if (ret < 0) {
@@ -3072,7 +3073,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    if (!migrate_multifd_flush_after_each_section()) {
+    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
         qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
     }
 
@@ -3184,7 +3185,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 out:
     if (ret >= 0
         && migration_is_setup_or_active(migrate_get_current()->state)) {
-        if (migrate_multifd_flush_after_each_section()) {
+        if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
             ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
             if (ret < 0) {
                 return ret;
@@ -3261,7 +3262,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    if (!migrate_multifd_flush_after_each_section()) {
+    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
         qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
     }
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
@@ -3768,7 +3769,8 @@ int ram_load_postcopy(QEMUFile *f, int channel)
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
-            if (migrate_multifd_flush_after_each_section()) {
+            if (migrate_multifd() &&
+                migrate_multifd_flush_after_each_section()) {
                 multifd_recv_sync_main();
             }
             break;
@@ -4046,7 +4048,8 @@ static int ram_load_precopy(QEMUFile *f)
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
-            if (migrate_multifd_flush_after_each_section()) {
+            if (migrate_multifd() &&
+                migrate_multifd_flush_after_each_section()) {
                 multifd_recv_sync_main();
             }
             break;
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 14/38] migration: Create migrate_rdma()
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (12 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 13/38] migration: Non multifd migration don't care about multifd flushes Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 15/38] migration/rdma: Unfold ram_control_before_iterate() Juan Quintela
                   ` (24 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Add a helper to say whether we are doing a migration over RDMA.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-2-quintela@redhat.com>
---
 migration/migration.h | 2 ++
 migration/options.h   | 1 +
 migration/migration.c | 1 +
 migration/options.c   | 7 +++++++
 migration/rdma.c      | 4 +++-
 5 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/migration/migration.h b/migration/migration.h
index 974897a8d0..ae82004892 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -469,6 +469,8 @@ struct MigrationState {
      * switchover has been received.
      */
     bool switchover_acked;
+    /* Is this a rdma migration */
+    bool rdma_migration;
 };
 
 void migrate_set_state(int *state, int old_state, int new_state);
diff --git a/migration/options.h b/migration/options.h
index 93ee938ab8..237f2d6b4a 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -56,6 +56,7 @@ bool migrate_zero_copy_send(void);
 
 bool migrate_multifd_flush_after_each_section(void);
 bool migrate_postcopy(void);
+bool migrate_rdma(void);
 bool migrate_tls(void);
 
 /* capabilities helpers */
diff --git a/migration/migration.c b/migration/migration.c
index 79fa11e3f6..6ba5e145ac 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1452,6 +1452,7 @@ int migrate_init(MigrationState *s, Error **errp)
     s->iteration_initial_bytes = 0;
     s->threshold_size = 0;
     s->switchover_acked = false;
+    s->rdma_migration = false;
     /*
      * set mig_stats compression_counters memory to zero for a
      * new migration
diff --git a/migration/options.c b/migration/options.c
index 546cbe3106..42fb818956 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -378,6 +378,13 @@ bool migrate_postcopy(void)
     return migrate_postcopy_ram() || migrate_dirty_bitmaps();
 }
 
+bool migrate_rdma(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return s->rdma_migration;
+}
+
 bool migrate_tls(void)
 {
     MigrationState *s = migrate_get_current();
diff --git a/migration/rdma.c b/migration/rdma.c
index f6fc226c9b..f155f3e1c8 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -4113,6 +4113,7 @@ static void rdma_accept_incoming_migration(void *opaque)
 
 void rdma_start_incoming_migration(const char *host_port, Error **errp)
 {
+    MigrationState *s = migrate_get_current();
     int ret;
     RDMAContext *rdma;
 
@@ -4144,7 +4145,7 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp)
     }
 
     trace_rdma_start_incoming_migration_after_rdma_listen();
-
+    s->rdma_migration = true;
     qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
                         NULL, (void *)(intptr_t)rdma);
     return;
@@ -4220,6 +4221,7 @@ void rdma_start_outgoing_migration(void *opaque,
     trace_rdma_start_outgoing_migration_after_rdma_connect();
 
     s->to_dst_file = rdma_new_output(rdma);
+    s->rdma_migration = true;
     migrate_fd_connect(s, NULL);
     return;
 return_path_err:
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 15/38] migration/rdma: Unfold ram_control_before_iterate()
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (13 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 14/38] migration: Create migrate_rdma() Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 16/38] migration/rdma: Unfold ram_control_after_iterate() Juan Quintela
                   ` (23 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Once there:
- Remove the unused data parameter
- Unfold it in its callers
- Change all callers to call qemu_rdma_registration_start()
- Call QIO_CHANNEL_RDMA() only after we check for migrate_rdma()

Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-3-quintela@redhat.com>
---
 migration/qemu-file.h |  2 --
 migration/rdma.h      |  7 +++++++
 migration/qemu-file.c | 13 +------------
 migration/ram.c       | 16 +++++++++++++---
 migration/rdma.c      | 12 ++++--------
 5 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 03e718c264..d6a370c569 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -55,7 +55,6 @@ typedef int (QEMURamSaveFunc)(QEMUFile *f,
                               size_t size);
 
 typedef struct QEMUFileHooks {
-    QEMURamHookFunc *before_ram_iterate;
     QEMURamHookFunc *after_ram_iterate;
     QEMURamHookFunc *hook_ram_load;
     QEMURamSaveFunc *save_page;
@@ -127,7 +126,6 @@ void qemu_fflush(QEMUFile *f);
 void qemu_file_set_blocking(QEMUFile *f, bool block);
 int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size);
 
-void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
 void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
 void ram_control_load_hook(QEMUFile *f, uint64_t flags, void *data);
 
diff --git a/migration/rdma.h b/migration/rdma.h
index de2ba09dc5..670c67a8cb 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -22,4 +22,11 @@ void rdma_start_outgoing_migration(void *opaque, const char *host_port,
 
 void rdma_start_incoming_migration(const char *host_port, Error **errp);
 
+
+#ifdef CONFIG_RDMA
+int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags);
+#else
+static inline
+int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags) { return 0; }
+#endif
 #endif
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 7fb659296f..5e2d73fd68 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -32,6 +32,7 @@
 #include "trace.h"
 #include "options.h"
 #include "qapi/error.h"
+#include "rdma.h"
 
 #define IO_BUF_SIZE 32768
 #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
@@ -297,18 +298,6 @@ void qemu_fflush(QEMUFile *f)
     f->iovcnt = 0;
 }
 
-void ram_control_before_iterate(QEMUFile *f, uint64_t flags)
-{
-    int ret = 0;
-
-    if (f->hooks && f->hooks->before_ram_iterate) {
-        ret = f->hooks->before_ram_iterate(f, flags, NULL);
-        if (ret < 0) {
-            qemu_file_set_error(f, ret);
-        }
-    }
-}
-
 void ram_control_after_iterate(QEMUFile *f, uint64_t flags)
 {
     int ret = 0;
diff --git a/migration/ram.c b/migration/ram.c
index acb8f95f00..6592431a4e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -59,6 +59,7 @@
 #include "qemu/iov.h"
 #include "multifd.h"
 #include "sysemu/runstate.h"
+#include "rdma.h"
 #include "options.h"
 #include "sysemu/dirtylimit.h"
 #include "sysemu/kvm.h"
@@ -3060,7 +3061,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
         }
     }
 
-    ram_control_before_iterate(f, RAM_CONTROL_SETUP);
+    ret = qemu_rdma_registration_start(f, RAM_CONTROL_SETUP);
+    if (ret < 0) {
+        qemu_file_set_error(f, ret);
+    }
     ram_control_after_iterate(f, RAM_CONTROL_SETUP);
 
     migration_ops = g_malloc0(sizeof(MigrationOps));
@@ -3123,7 +3127,10 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         /* Read version before ram_list.blocks */
         smp_rmb();
 
-        ram_control_before_iterate(f, RAM_CONTROL_ROUND);
+        ret = qemu_rdma_registration_start(f, RAM_CONTROL_ROUND);
+        if (ret < 0) {
+            qemu_file_set_error(f, ret);
+        }
 
         t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
         i = 0;
@@ -3228,7 +3235,10 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
             migration_bitmap_sync_precopy(rs, true);
         }
 
-        ram_control_before_iterate(f, RAM_CONTROL_FINISH);
+        ret = qemu_rdma_registration_start(f, RAM_CONTROL_FINISH);
+        if (ret < 0) {
+            qemu_file_set_error(f, ret);
+        }
 
         /* try transferring iterative blocks of memory */
 
diff --git a/migration/rdma.c b/migration/rdma.c
index f155f3e1c8..3d74ad6db0 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3850,18 +3850,15 @@ static int rdma_load_hook(QEMUFile *f, uint64_t flags, void *data)
     }
 }
 
-static int qemu_rdma_registration_start(QEMUFile *f,
-                                        uint64_t flags, void *data)
+int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags)
 {
-    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
-    RDMAContext *rdma;
-
-    if (migration_in_postcopy()) {
+    if (!migrate_rdma() || migration_in_postcopy()) {
         return 0;
     }
 
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
     RCU_READ_LOCK_GUARD();
-    rdma = qatomic_rcu_read(&rioc->rdmaout);
+    RDMAContext *rdma = qatomic_rcu_read(&rioc->rdmaout);
     if (!rdma) {
         return -1;
     }
@@ -4002,7 +3999,6 @@ static const QEMUFileHooks rdma_read_hooks = {
 };
 
 static const QEMUFileHooks rdma_write_hooks = {
-    .before_ram_iterate = qemu_rdma_registration_start,
     .after_ram_iterate  = qemu_rdma_registration_stop,
     .save_page          = qemu_rdma_save_page,
 };
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 16/38] migration/rdma: Unfold ram_control_after_iterate()
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (14 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 15/38] migration/rdma: Unfold ram_control_before_iterate() Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 17/38] migration/rdma: Remove all uses of RAM_CONTROL_HOOK Juan Quintela
                   ` (22 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Once there:
- Remove the unused data parameter
- Unfold it in its callers
- Change all callers to call qemu_rdma_registration_stop()
- Call QIO_CHANNEL_RDMA() only after we check for migrate_rdma()

Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-4-quintela@redhat.com>
---
 migration/qemu-file.h |  2 --
 migration/rdma.h      |  3 +++
 migration/qemu-file.c | 12 ------------
 migration/ram.c       | 17 ++++++++++++++---
 migration/rdma.c      |  9 ++++-----
 5 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index d6a370c569..35e671a01e 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -55,7 +55,6 @@ typedef int (QEMURamSaveFunc)(QEMUFile *f,
                               size_t size);
 
 typedef struct QEMUFileHooks {
-    QEMURamHookFunc *after_ram_iterate;
     QEMURamHookFunc *hook_ram_load;
     QEMURamSaveFunc *save_page;
 } QEMUFileHooks;
@@ -126,7 +125,6 @@ void qemu_fflush(QEMUFile *f);
 void qemu_file_set_blocking(QEMUFile *f, bool block);
 int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size);
 
-void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
 void ram_control_load_hook(QEMUFile *f, uint64_t flags, void *data);
 
 /* Whenever this is found in the data stream, the flags
diff --git a/migration/rdma.h b/migration/rdma.h
index 670c67a8cb..c13b94c782 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -25,8 +25,11 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp);
 
 #ifdef CONFIG_RDMA
 int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags);
+int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags);
 #else
 static inline
 int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags) { return 0; }
+static inline
+int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags) { return 0; }
 #endif
 #endif
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 5e2d73fd68..e7dba2a849 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -298,18 +298,6 @@ void qemu_fflush(QEMUFile *f)
     f->iovcnt = 0;
 }
 
-void ram_control_after_iterate(QEMUFile *f, uint64_t flags)
-{
-    int ret = 0;
-
-    if (f->hooks && f->hooks->after_ram_iterate) {
-        ret = f->hooks->after_ram_iterate(f, flags, NULL);
-        if (ret < 0) {
-            qemu_file_set_error(f, ret);
-        }
-    }
-}
-
 void ram_control_load_hook(QEMUFile *f, uint64_t flags, void *data)
 {
     if (f->hooks && f->hooks->hook_ram_load) {
diff --git a/migration/ram.c b/migration/ram.c
index 6592431a4e..f1ddc1f9fa 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3065,7 +3065,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     if (ret < 0) {
         qemu_file_set_error(f, ret);
     }
-    ram_control_after_iterate(f, RAM_CONTROL_SETUP);
+
+    ret = qemu_rdma_registration_stop(f, RAM_CONTROL_SETUP);
+    if (ret < 0) {
+        qemu_file_set_error(f, ret);
+    }
 
     migration_ops = g_malloc0(sizeof(MigrationOps));
     migration_ops->ram_save_target_page = ram_save_target_page_legacy;
@@ -3187,7 +3191,10 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
      * Must occur before EOS (or any QEMUFile operation)
      * because of RDMA protocol.
      */
-    ram_control_after_iterate(f, RAM_CONTROL_ROUND);
+    ret = qemu_rdma_registration_stop(f, RAM_CONTROL_ROUND);
+    if (ret < 0) {
+        qemu_file_set_error(f, ret);
+    }
 
 out:
     if (ret >= 0
@@ -3260,7 +3267,11 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         qemu_mutex_unlock(&rs->bitmap_mutex);
 
         ram_flush_compressed_data(rs);
-        ram_control_after_iterate(f, RAM_CONTROL_FINISH);
+
+        int ret = qemu_rdma_registration_stop(f, RAM_CONTROL_FINISH);
+        if (ret < 0) {
+            qemu_file_set_error(f, ret);
+        }
     }
 
     if (ret < 0) {
diff --git a/migration/rdma.c b/migration/rdma.c
index 3d74ad6db0..4b32d375ec 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3878,20 +3878,20 @@ int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags)
  * Inform dest that dynamic registrations are done for now.
  * First, flush writes, if any.
  */
-static int qemu_rdma_registration_stop(QEMUFile *f,
-                                       uint64_t flags, void *data)
+int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags)
 {
-    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
+    QIOChannelRDMA *rioc;
     Error *err = NULL;
     RDMAContext *rdma;
     RDMAControlHeader head = { .len = 0, .repeat = 1 };
     int ret;
 
-    if (migration_in_postcopy()) {
+    if (!migrate_rdma() || migration_in_postcopy()) {
         return 0;
     }
 
     RCU_READ_LOCK_GUARD();
+    rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
     rdma = qatomic_rcu_read(&rioc->rdmaout);
     if (!rdma) {
         return -1;
@@ -3999,7 +3999,6 @@ static const QEMUFileHooks rdma_read_hooks = {
 };
 
 static const QEMUFileHooks rdma_write_hooks = {
-    .after_ram_iterate  = qemu_rdma_registration_stop,
     .save_page          = qemu_rdma_save_page,
 };
 
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 17/38] migration/rdma: Remove all uses of RAM_CONTROL_HOOK
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (15 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 16/38] migration/rdma: Unfold ram_control_after_iterate() Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 18/38] migration/rdma: Unfold hook_ram_load() Juan Quintela
                   ` (21 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Instead of going through ram_control_load_hook(), call
qemu_rdma_registration_handle() directly.

Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-5-quintela@redhat.com>
---
 migration/qemu-file.h |  1 -
 migration/rdma.h      |  3 +++
 migration/ram.c       |  5 ++++-
 migration/rdma.c      | 12 +++++++-----
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 35e671a01e..14ff0d9cc4 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -41,7 +41,6 @@ typedef int (QEMURamHookFunc)(QEMUFile *f, uint64_t flags, void *data);
  */
 #define RAM_CONTROL_SETUP     0
 #define RAM_CONTROL_ROUND     1
-#define RAM_CONTROL_HOOK      2
 #define RAM_CONTROL_FINISH    3
 #define RAM_CONTROL_BLOCK_REG 4
 
diff --git a/migration/rdma.h b/migration/rdma.h
index c13b94c782..8bd277efb9 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -24,10 +24,13 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp);
 
 
 #ifdef CONFIG_RDMA
+int qemu_rdma_registration_handle(QEMUFile *f);
 int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags);
 int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags);
 #else
 static inline
+int qemu_rdma_registration_handle(QEMUFile *f) { return 0; }
+static inline
 int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags) { return 0; }
 static inline
 int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags) { return 0; }
diff --git a/migration/ram.c b/migration/ram.c
index f1ddc1f9fa..f6ea1831b5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4075,7 +4075,10 @@ static int ram_load_precopy(QEMUFile *f)
             }
             break;
         case RAM_SAVE_FLAG_HOOK:
-            ram_control_load_hook(f, RAM_CONTROL_HOOK, NULL);
+            ret = qemu_rdma_registration_handle(f);
+            if (ret < 0) {
+                qemu_file_set_error(f, ret);
+            }
             break;
         default:
             error_report("Unknown combination of migration flags: 0x%x", flags);
diff --git a/migration/rdma.c b/migration/rdma.c
index 4b32d375ec..5c20f425a9 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3522,7 +3522,7 @@ static int dest_ram_sort_func(const void *a, const void *b)
  *
  * Keep doing this until the source tells us to stop.
  */
-static int qemu_rdma_registration_handle(QEMUFile *f)
+int qemu_rdma_registration_handle(QEMUFile *f)
 {
     RDMAControlHeader reg_resp = { .len = sizeof(RDMARegisterResult),
                                .type = RDMA_CONTROL_REGISTER_RESULT,
@@ -3534,7 +3534,7 @@ static int qemu_rdma_registration_handle(QEMUFile *f)
                              };
     RDMAControlHeader blocks = { .type = RDMA_CONTROL_RAM_BLOCKS_RESULT,
                                  .repeat = 1 };
-    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
+    QIOChannelRDMA *rioc;
     Error *err = NULL;
     RDMAContext *rdma;
     RDMALocalBlocks *local;
@@ -3550,7 +3550,12 @@ static int qemu_rdma_registration_handle(QEMUFile *f)
     int count = 0;
     int i = 0;
 
+    if (!migrate_rdma()) {
+        return 0;
+    }
+
     RCU_READ_LOCK_GUARD();
+    rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
     rdma = qatomic_rcu_read(&rioc->rdmain);
 
     if (!rdma) {
@@ -3841,9 +3846,6 @@ static int rdma_load_hook(QEMUFile *f, uint64_t flags, void *data)
     case RAM_CONTROL_BLOCK_REG:
         return rdma_block_notification_handle(f, data);
 
-    case RAM_CONTROL_HOOK:
-        return qemu_rdma_registration_handle(f);
-
     default:
         /* Shouldn't be called with any other values */
         abort();
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 18/38] migration/rdma: Unfold hook_ram_load()
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (16 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 17/38] migration/rdma: Remove all uses of RAM_CONTROL_HOOK Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 19/38] migration/rdma: Create rdma_control_save_page() Juan Quintela
                   ` (20 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

There is only one flag it is ever called with: RAM_CONTROL_BLOCK_REG.

Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-6-quintela@redhat.com>
---
 migration/qemu-file.h | 11 -----------
 migration/rdma.h      |  3 +++
 migration/qemu-file.c | 10 ----------
 migration/ram.c       |  6 ++++--
 migration/rdma.c      | 34 +++++++++++-----------------------
 5 files changed, 18 insertions(+), 46 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 14ff0d9cc4..80c30631dc 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -29,20 +29,12 @@
 #include "exec/cpu-common.h"
 #include "io/channel.h"
 
-/*
- * This function provides hooks around different
- * stages of RAM migration.
- * 'data' is call specific data associated with the 'flags' value
- */
-typedef int (QEMURamHookFunc)(QEMUFile *f, uint64_t flags, void *data);
-
 /*
  * Constants used by ram_control_* hooks
  */
 #define RAM_CONTROL_SETUP     0
 #define RAM_CONTROL_ROUND     1
 #define RAM_CONTROL_FINISH    3
-#define RAM_CONTROL_BLOCK_REG 4
 
 /*
  * This function allows override of where the RAM page
@@ -54,7 +46,6 @@ typedef int (QEMURamSaveFunc)(QEMUFile *f,
                               size_t size);
 
 typedef struct QEMUFileHooks {
-    QEMURamHookFunc *hook_ram_load;
     QEMURamSaveFunc *save_page;
 } QEMUFileHooks;
 
@@ -124,8 +115,6 @@ void qemu_fflush(QEMUFile *f);
 void qemu_file_set_blocking(QEMUFile *f, bool block);
 int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size);
 
-void ram_control_load_hook(QEMUFile *f, uint64_t flags, void *data);
-
 /* Whenever this is found in the data stream, the flags
  * will be passed to ram_control_load_hook in the incoming-migration
  * side. This lets before_ram_iterate/after_ram_iterate add
diff --git a/migration/rdma.h b/migration/rdma.h
index 8bd277efb9..8df8b4089a 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -27,6 +27,7 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp);
 int qemu_rdma_registration_handle(QEMUFile *f);
 int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags);
 int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags);
+int rdma_block_notification_handle(QEMUFile *f, const char *name);
 #else
 static inline
 int qemu_rdma_registration_handle(QEMUFile *f) { return 0; }
@@ -34,5 +35,7 @@ static inline
 int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags) { return 0; }
 static inline
 int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags) { return 0; }
+static inline
+int rdma_block_notification_handle(QEMUFile *f, const char *name) { return 0; }
 #endif
 #endif
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index e7dba2a849..4a414b8976 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -298,16 +298,6 @@ void qemu_fflush(QEMUFile *f)
     f->iovcnt = 0;
 }
 
-void ram_control_load_hook(QEMUFile *f, uint64_t flags, void *data)
-{
-    if (f->hooks && f->hooks->hook_ram_load) {
-        int ret = f->hooks->hook_ram_load(f, flags, data);
-        if (ret < 0) {
-            qemu_file_set_error(f, ret);
-        }
-    }
-}
-
 int ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                           ram_addr_t offset, size_t size)
 {
diff --git a/migration/ram.c b/migration/ram.c
index f6ea1831b5..8c462276cd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4025,8 +4025,10 @@ static int ram_load_precopy(QEMUFile *f)
                             ret = -EINVAL;
                         }
                     }
-                    ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
-                                          block->idstr);
+                    ret = rdma_block_notification_handle(f, block->idstr);
+                    if (ret < 0) {
+                        qemu_file_set_error(f, ret);
+                    }
                 } else {
                     error_report("Unknown ramblock \"%s\", cannot "
                                  "accept migration", id);
diff --git a/migration/rdma.c b/migration/rdma.c
index 5c20f425a9..0b1cb03b2b 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3799,22 +3799,23 @@ err:
 }
 
 /* Destination:
- * Called via a ram_control_load_hook during the initial RAM load section which
- * lists the RAMBlocks by name.  This lets us know the order of the RAMBlocks
- * on the source.
- * We've already built our local RAMBlock list, but not yet sent the list to
- * the source.
+ * Called during the initial RAM load section which lists the
+ * RAMBlocks by name.  This lets us know the order of the RAMBlocks on
+ * the source.  We've already built our local RAMBlock list, but not
+ * yet sent the list to the source.
  */
-static int
-rdma_block_notification_handle(QEMUFile *f, const char *name)
+int rdma_block_notification_handle(QEMUFile *f, const char *name)
 {
-    RDMAContext *rdma;
-    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
     int curr;
     int found = -1;
 
+    if (!migrate_rdma()) {
+        return 0;
+    }
+
     RCU_READ_LOCK_GUARD();
-    rdma = qatomic_rcu_read(&rioc->rdmain);
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
+    RDMAContext *rdma = qatomic_rcu_read(&rioc->rdmain);
 
     if (!rdma) {
         return -1;
@@ -3840,18 +3841,6 @@ rdma_block_notification_handle(QEMUFile *f, const char *name)
     return 0;
 }
 
-static int rdma_load_hook(QEMUFile *f, uint64_t flags, void *data)
-{
-    switch (flags) {
-    case RAM_CONTROL_BLOCK_REG:
-        return rdma_block_notification_handle(f, data);
-
-    default:
-        /* Shouldn't be called with any other values */
-        abort();
-    }
-}
-
 int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags)
 {
     if (!migrate_rdma() || migration_in_postcopy()) {
@@ -3997,7 +3986,6 @@ err:
 }
 
 static const QEMUFileHooks rdma_read_hooks = {
-    .hook_ram_load = rdma_load_hook,
 };
 
 static const QEMUFileHooks rdma_write_hooks = {
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 19/38] migration/rdma: Create rdma_control_save_page()
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (17 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 18/38] migration/rdma: Unfold hook_ram_load() Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 20/38] qemu-file: Remove QEMUFileHooks Juan Quintela
                   ` (19 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

The only user of ram_control_save_page() and the save_page() hook was
RDMA. Just move the function to rdma.c and rename it to
rdma_control_save_page().

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-7-quintela@redhat.com>
---
 migration/qemu-file.h | 12 ------------
 migration/rdma.h      | 10 ++++++++++
 migration/qemu-file.c | 20 --------------------
 migration/ram.c       |  4 ++--
 migration/rdma.c      | 19 ++++++++++++++++++-
 5 files changed, 30 insertions(+), 35 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 80c30631dc..60510a2819 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -36,17 +36,7 @@
 #define RAM_CONTROL_ROUND     1
 #define RAM_CONTROL_FINISH    3
 
-/*
- * This function allows override of where the RAM page
- * is saved (such as RDMA, for example.)
- */
-typedef int (QEMURamSaveFunc)(QEMUFile *f,
-                              ram_addr_t block_offset,
-                              ram_addr_t offset,
-                              size_t size);
-
 typedef struct QEMUFileHooks {
-    QEMURamSaveFunc *save_page;
 } QEMUFileHooks;
 
 QEMUFile *qemu_file_new_input(QIOChannel *ioc);
@@ -125,8 +115,6 @@ int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size);
 #define RAM_SAVE_CONTROL_NOT_SUPP -1000
 #define RAM_SAVE_CONTROL_DELAYED  -2000
 
-int ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
-                          ram_addr_t offset, size_t size);
 QIOChannel *qemu_file_get_ioc(QEMUFile *file);
 
 #endif
diff --git a/migration/rdma.h b/migration/rdma.h
index 8df8b4089a..09a16c1e3c 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -17,6 +17,8 @@
 #ifndef QEMU_MIGRATION_RDMA_H
 #define QEMU_MIGRATION_RDMA_H
 
+#include "exec/memory.h"
+
 void rdma_start_outgoing_migration(void *opaque, const char *host_port,
                                    Error **errp);
 
@@ -28,6 +30,8 @@ int qemu_rdma_registration_handle(QEMUFile *f);
 int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags);
 int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags);
 int rdma_block_notification_handle(QEMUFile *f, const char *name);
+int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
+                           ram_addr_t offset, size_t size);
 #else
 static inline
 int qemu_rdma_registration_handle(QEMUFile *f) { return 0; }
@@ -37,5 +41,11 @@ static inline
 int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags) { return 0; }
 static inline
 int rdma_block_notification_handle(QEMUFile *f, const char *name) { return 0; }
+static inline
+int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
+                           ram_addr_t offset, size_t size)
+{
+    return RAM_SAVE_CONTROL_NOT_SUPP;
+}
 #endif
 #endif
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 4a414b8976..745eaf7a5b 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -298,26 +298,6 @@ void qemu_fflush(QEMUFile *f)
     f->iovcnt = 0;
 }
 
-int ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
-                          ram_addr_t offset, size_t size)
-{
-    if (f->hooks && f->hooks->save_page) {
-        int ret = f->hooks->save_page(f, block_offset, offset, size);
-        /*
-         * RAM_SAVE_CONTROL_* are negative values
-         */
-        if (ret != RAM_SAVE_CONTROL_DELAYED &&
-            ret != RAM_SAVE_CONTROL_NOT_SUPP) {
-            if (ret < 0) {
-                qemu_file_set_error(f, ret);
-            }
-        }
-        return ret;
-    }
-
-    return RAM_SAVE_CONTROL_NOT_SUPP;
-}
-
 /*
  * Attempt to fill the buffer from the underlying file
  * Returns the number of bytes read, or negative value for an error.
diff --git a/migration/ram.c b/migration/ram.c
index 8c462276cd..3b4b09f6ff 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1197,8 +1197,8 @@ static bool control_save_page(PageSearchStatus *pss, RAMBlock *block,
 {
     int ret;
 
-    ret = ram_control_save_page(pss->pss_channel, block->offset, offset,
-                                TARGET_PAGE_SIZE);
+    ret = rdma_control_save_page(pss->pss_channel, block->offset, offset,
+                                 TARGET_PAGE_SIZE);
     if (ret == RAM_SAVE_CONTROL_NOT_SUPP) {
         return false;
     }
diff --git a/migration/rdma.c b/migration/rdma.c
index 0b1cb03b2b..f66bd939d7 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3314,6 +3314,24 @@ err:
     return -1;
 }
 
+int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
+                           ram_addr_t offset, size_t size)
+{
+    if (!migrate_rdma()) {
+        return RAM_SAVE_CONTROL_NOT_SUPP;
+    }
+
+    int ret = qemu_rdma_save_page(f, block_offset, offset, size);
+
+    if (ret != RAM_SAVE_CONTROL_DELAYED &&
+        ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+        if (ret < 0) {
+            qemu_file_set_error(f, ret);
+        }
+    }
+    return ret;
+}
+
 static void rdma_accept_incoming_migration(void *opaque);
 
 static void rdma_cm_poll_handler(void *opaque)
@@ -3989,7 +4007,6 @@ static const QEMUFileHooks rdma_read_hooks = {
 };
 
 static const QEMUFileHooks rdma_write_hooks = {
-    .save_page          = qemu_rdma_save_page,
 };
 
 
-- 
2.41.0




* [PULL 20/38] qemu-file: Remove QEMUFileHooks
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (18 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 19/38] migration/rdma: Create rdma_control_save_page() Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 21/38] migration/rdma: Move rdma constants from qemu-file.h to rdma.h Juan Quintela
                   ` (18 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

The only user of QEMUFileHooks was RDMA, and that use is now gone.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-8-quintela@redhat.com>
---
 migration/qemu-file.h | 4 ----
 migration/qemu-file.c | 6 ------
 migration/rdma.c      | 9 ---------
 3 files changed, 19 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 60510a2819..0b22d8335f 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -36,12 +36,8 @@
 #define RAM_CONTROL_ROUND     1
 #define RAM_CONTROL_FINISH    3
 
-typedef struct QEMUFileHooks {
-} QEMUFileHooks;
-
 QEMUFile *qemu_file_new_input(QIOChannel *ioc);
 QEMUFile *qemu_file_new_output(QIOChannel *ioc);
-void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks);
 int qemu_fclose(QEMUFile *f);
 
 /*
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 745eaf7a5b..3fb25148d1 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -38,7 +38,6 @@
 #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
 
 struct QEMUFile {
-    const QEMUFileHooks *hooks;
     QIOChannel *ioc;
     bool is_writable;
 
@@ -133,11 +132,6 @@ QEMUFile *qemu_file_new_input(QIOChannel *ioc)
     return qemu_file_new_impl(ioc, false);
 }
 
-void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
-{
-    f->hooks = hooks;
-}
-
 /*
  * Get last error for stream f with optional Error*
  *
diff --git a/migration/rdma.c b/migration/rdma.c
index f66bd939d7..9883b0a250 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -4003,13 +4003,6 @@ err:
     return -1;
 }
 
-static const QEMUFileHooks rdma_read_hooks = {
-};
-
-static const QEMUFileHooks rdma_write_hooks = {
-};
-
-
 static void qio_channel_rdma_finalize(Object *obj)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(obj);
@@ -4061,7 +4054,6 @@ static QEMUFile *rdma_new_input(RDMAContext *rdma)
     rioc->file = qemu_file_new_input(QIO_CHANNEL(rioc));
     rioc->rdmain = rdma;
     rioc->rdmaout = rdma->return_path;
-    qemu_file_set_hooks(rioc->file, &rdma_read_hooks);
 
     return rioc->file;
 }
@@ -4073,7 +4065,6 @@ static QEMUFile *rdma_new_output(RDMAContext *rdma)
     rioc->file = qemu_file_new_output(QIO_CHANNEL(rioc));
     rioc->rdmaout = rdma;
     rioc->rdmain = rdma->return_path;
-    qemu_file_set_hooks(rioc->file, &rdma_write_hooks);
 
     return rioc->file;
 }
-- 
2.41.0




* [PULL 21/38] migration/rdma: Move rdma constants from qemu-file.h to rdma.h
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (19 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 20/38] qemu-file: Remove QEMUFileHooks Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 22/38] migration/rdma: Remove qemu_ prefix from exported functions Juan Quintela
                   ` (17 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-9-quintela@redhat.com>
---
 migration/qemu-file.h | 17 -----------------
 migration/rdma.h      | 16 ++++++++++++++++
 migration/ram.c       |  2 +-
 3 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 0b22d8335f..a29c37b0d0 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -29,13 +29,6 @@
 #include "exec/cpu-common.h"
 #include "io/channel.h"
 
-/*
- * Constants used by ram_control_* hooks
- */
-#define RAM_CONTROL_SETUP     0
-#define RAM_CONTROL_ROUND     1
-#define RAM_CONTROL_FINISH    3
-
 QEMUFile *qemu_file_new_input(QIOChannel *ioc);
 QEMUFile *qemu_file_new_output(QIOChannel *ioc);
 int qemu_fclose(QEMUFile *f);
@@ -101,16 +94,6 @@ void qemu_fflush(QEMUFile *f);
 void qemu_file_set_blocking(QEMUFile *f, bool block);
 int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size);
 
-/* Whenever this is found in the data stream, the flags
- * will be passed to ram_control_load_hook in the incoming-migration
- * side. This lets before_ram_iterate/after_ram_iterate add
- * transport-specific sections to the RAM migration data.
- */
-#define RAM_SAVE_FLAG_HOOK     0x80
-
-#define RAM_SAVE_CONTROL_NOT_SUPP -1000
-#define RAM_SAVE_CONTROL_DELAYED  -2000
-
 QIOChannel *qemu_file_get_ioc(QEMUFile *file);
 
 #endif
diff --git a/migration/rdma.h b/migration/rdma.h
index 09a16c1e3c..1ff3718a76 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -24,6 +24,22 @@ void rdma_start_outgoing_migration(void *opaque, const char *host_port,
 
 void rdma_start_incoming_migration(const char *host_port, Error **errp);
 
+/*
+ * Constants used by rdma return codes
+ */
+#define RAM_CONTROL_SETUP     0
+#define RAM_CONTROL_ROUND     1
+#define RAM_CONTROL_FINISH    3
+
+/*
+ * Whenever this is found in the data stream, the flags
+ * will be passed to rdma functions in the incoming-migration
+ * side.
+ */
+#define RAM_SAVE_FLAG_HOOK     0x80
+
+#define RAM_SAVE_CONTROL_NOT_SUPP -1000
+#define RAM_SAVE_CONTROL_DELAYED  -2000
 
 #ifdef CONFIG_RDMA
 int qemu_rdma_registration_handle(QEMUFile *f);
diff --git a/migration/ram.c b/migration/ram.c
index 3b4b09f6ff..6a4aed2a75 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -89,7 +89,7 @@
 #define RAM_SAVE_FLAG_EOS      0x10
 #define RAM_SAVE_FLAG_CONTINUE 0x20
 #define RAM_SAVE_FLAG_XBZRLE   0x40
-/* 0x80 is reserved in qemu-file.h for RAM_SAVE_FLAG_HOOK */
+/* 0x80 is reserved in rdma.h for RAM_SAVE_FLAG_HOOK */
 #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
 #define RAM_SAVE_FLAG_MULTIFD_FLUSH    0x200
 /* We can't use any flag that is bigger than 0x200 */
-- 
2.41.0




* [PULL 22/38] migration/rdma: Remove qemu_ prefix from exported functions
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (20 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 21/38] migration/rdma: Move rdma constants from qemu-file.h to rdma.h Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 23/38] migration/rdma: Check sooner if we are in postcopy for save_page() Juan Quintela
                   ` (16 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Functions are long enough even without this.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-10-quintela@redhat.com>
---
 migration/rdma.h       | 12 ++++++------
 migration/ram.c        | 14 +++++++-------
 migration/rdma.c       | 40 +++++++++++++++++++---------------------
 migration/trace-events | 28 ++++++++++++++--------------
 4 files changed, 46 insertions(+), 48 deletions(-)

diff --git a/migration/rdma.h b/migration/rdma.h
index 1ff3718a76..30b15b4466 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -42,19 +42,19 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp);
 #define RAM_SAVE_CONTROL_DELAYED  -2000
 
 #ifdef CONFIG_RDMA
-int qemu_rdma_registration_handle(QEMUFile *f);
-int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags);
-int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags);
+int rdma_registration_handle(QEMUFile *f);
+int rdma_registration_start(QEMUFile *f, uint64_t flags);
+int rdma_registration_stop(QEMUFile *f, uint64_t flags);
 int rdma_block_notification_handle(QEMUFile *f, const char *name);
 int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                            ram_addr_t offset, size_t size);
 #else
 static inline
-int qemu_rdma_registration_handle(QEMUFile *f) { return 0; }
+int rdma_registration_handle(QEMUFile *f) { return 0; }
 static inline
-int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags) { return 0; }
+int rdma_registration_start(QEMUFile *f, uint64_t flags) { return 0; }
 static inline
-int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags) { return 0; }
+int rdma_registration_stop(QEMUFile *f, uint64_t flags) { return 0; }
 static inline
 int rdma_block_notification_handle(QEMUFile *f, const char *name) { return 0; }
 static inline
diff --git a/migration/ram.c b/migration/ram.c
index 6a4aed2a75..a9bc6ae1f1 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3061,12 +3061,12 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
         }
     }
 
-    ret = qemu_rdma_registration_start(f, RAM_CONTROL_SETUP);
+    ret = rdma_registration_start(f, RAM_CONTROL_SETUP);
     if (ret < 0) {
         qemu_file_set_error(f, ret);
     }
 
-    ret = qemu_rdma_registration_stop(f, RAM_CONTROL_SETUP);
+    ret = rdma_registration_stop(f, RAM_CONTROL_SETUP);
     if (ret < 0) {
         qemu_file_set_error(f, ret);
     }
@@ -3131,7 +3131,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         /* Read version before ram_list.blocks */
         smp_rmb();
 
-        ret = qemu_rdma_registration_start(f, RAM_CONTROL_ROUND);
+        ret = rdma_registration_start(f, RAM_CONTROL_ROUND);
         if (ret < 0) {
             qemu_file_set_error(f, ret);
         }
@@ -3191,7 +3191,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
      * Must occur before EOS (or any QEMUFile operation)
      * because of RDMA protocol.
      */
-    ret = qemu_rdma_registration_stop(f, RAM_CONTROL_ROUND);
+    ret = rdma_registration_stop(f, RAM_CONTROL_ROUND);
     if (ret < 0) {
         qemu_file_set_error(f, ret);
     }
@@ -3242,7 +3242,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
             migration_bitmap_sync_precopy(rs, true);
         }
 
-        ret = qemu_rdma_registration_start(f, RAM_CONTROL_FINISH);
+        ret = rdma_registration_start(f, RAM_CONTROL_FINISH);
         if (ret < 0) {
             qemu_file_set_error(f, ret);
         }
@@ -3268,7 +3268,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
         ram_flush_compressed_data(rs);
 
-        int ret = qemu_rdma_registration_stop(f, RAM_CONTROL_FINISH);
+        int ret = rdma_registration_stop(f, RAM_CONTROL_FINISH);
         if (ret < 0) {
             qemu_file_set_error(f, ret);
         }
@@ -4077,7 +4077,7 @@ static int ram_load_precopy(QEMUFile *f)
             }
             break;
         case RAM_SAVE_FLAG_HOOK:
-            ret = qemu_rdma_registration_handle(f);
+            ret = rdma_registration_handle(f);
             if (ret < 0) {
                 qemu_file_set_error(f, ret);
             }
diff --git a/migration/rdma.c b/migration/rdma.c
index 9883b0a250..c147c94b08 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3540,7 +3540,7 @@ static int dest_ram_sort_func(const void *a, const void *b)
  *
  * Keep doing this until the source tells us to stop.
  */
-int qemu_rdma_registration_handle(QEMUFile *f)
+int rdma_registration_handle(QEMUFile *f)
 {
     RDMAControlHeader reg_resp = { .len = sizeof(RDMARegisterResult),
                                .type = RDMA_CONTROL_REGISTER_RESULT,
@@ -3586,7 +3586,7 @@ int qemu_rdma_registration_handle(QEMUFile *f)
 
     local = &rdma->local_ram_blocks;
     do {
-        trace_qemu_rdma_registration_handle_wait();
+        trace_rdma_registration_handle_wait();
 
         ret = qemu_rdma_exchange_recv(rdma, &head, RDMA_CONTROL_NONE, &err);
 
@@ -3606,9 +3606,9 @@ int qemu_rdma_registration_handle(QEMUFile *f)
             comp = (RDMACompress *) rdma->wr_data[idx].control_curr;
             network_to_compress(comp);
 
-            trace_qemu_rdma_registration_handle_compress(comp->length,
-                                                         comp->block_idx,
-                                                         comp->offset);
+            trace_rdma_registration_handle_compress(comp->length,
+                                                    comp->block_idx,
+                                                    comp->offset);
             if (comp->block_idx >= rdma->local_ram_blocks.nb_blocks) {
                 error_report("rdma: 'compress' bad block index %u (vs %d)",
                              (unsigned int)comp->block_idx,
@@ -3624,11 +3624,11 @@ int qemu_rdma_registration_handle(QEMUFile *f)
             break;
 
         case RDMA_CONTROL_REGISTER_FINISHED:
-            trace_qemu_rdma_registration_handle_finished();
+            trace_rdma_registration_handle_finished();
             return 0;
 
         case RDMA_CONTROL_RAM_BLOCKS_REQUEST:
-            trace_qemu_rdma_registration_handle_ram_blocks();
+            trace_rdma_registration_handle_ram_blocks();
 
             /* Sort our local RAM Block list so it's the same as the source,
              * we can do this since we've filled in a src_index in the list
@@ -3667,7 +3667,7 @@ int qemu_rdma_registration_handle(QEMUFile *f)
                 rdma->dest_blocks[i].length = local->block[i].length;
 
                 dest_block_to_network(&rdma->dest_blocks[i]);
-                trace_qemu_rdma_registration_handle_ram_blocks_loop(
+                trace_rdma_registration_handle_ram_blocks_loop(
                     local->block[i].block_name,
                     local->block[i].offset,
                     local->block[i].length,
@@ -3690,7 +3690,7 @@ int qemu_rdma_registration_handle(QEMUFile *f)
 
             break;
         case RDMA_CONTROL_REGISTER_REQUEST:
-            trace_qemu_rdma_registration_handle_register(head.repeat);
+            trace_rdma_registration_handle_register(head.repeat);
 
             reg_resp.repeat = head.repeat;
             registers = (RDMARegister *) rdma->wr_data[idx].control_curr;
@@ -3704,7 +3704,7 @@ int qemu_rdma_registration_handle(QEMUFile *f)
 
                 reg_result = &results[count];
 
-                trace_qemu_rdma_registration_handle_register_loop(count,
+                trace_rdma_registration_handle_register_loop(count,
                          reg->current_index, reg->key.current_addr, reg->chunks);
 
                 if (reg->current_index >= rdma->local_ram_blocks.nb_blocks) {
@@ -3752,8 +3752,7 @@ int qemu_rdma_registration_handle(QEMUFile *f)
 
                 reg_result->host_addr = (uintptr_t)block->local_host_addr;
 
-                trace_qemu_rdma_registration_handle_register_rkey(
-                                                           reg_result->rkey);
+                trace_rdma_registration_handle_register_rkey(reg_result->rkey);
 
                 result_to_network(reg_result);
             }
@@ -3767,7 +3766,7 @@ int qemu_rdma_registration_handle(QEMUFile *f)
             }
             break;
         case RDMA_CONTROL_UNREGISTER_REQUEST:
-            trace_qemu_rdma_registration_handle_unregister(head.repeat);
+            trace_rdma_registration_handle_unregister(head.repeat);
             unreg_resp.repeat = head.repeat;
             registers = (RDMARegister *) rdma->wr_data[idx].control_curr;
 
@@ -3775,7 +3774,7 @@ int qemu_rdma_registration_handle(QEMUFile *f)
                 reg = &registers[count];
                 network_to_register(reg);
 
-                trace_qemu_rdma_registration_handle_unregister_loop(count,
+                trace_rdma_registration_handle_unregister_loop(count,
                            reg->current_index, reg->key.chunk);
 
                 block = &(rdma->local_ram_blocks.block[reg->current_index]);
@@ -3791,8 +3790,7 @@ int qemu_rdma_registration_handle(QEMUFile *f)
 
                 rdma->total_registrations--;
 
-                trace_qemu_rdma_registration_handle_unregister_success(
-                                                       reg->key.chunk);
+                trace_rdma_registration_handle_unregister_success(reg->key.chunk);
             }
 
             ret = qemu_rdma_post_send_control(rdma, NULL, &unreg_resp, &err);
@@ -3859,7 +3857,7 @@ int rdma_block_notification_handle(QEMUFile *f, const char *name)
     return 0;
 }
 
-int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags)
+int rdma_registration_start(QEMUFile *f, uint64_t flags)
 {
     if (!migrate_rdma() || migration_in_postcopy()) {
         return 0;
@@ -3876,7 +3874,7 @@ int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags)
         return -1;
     }
 
-    trace_qemu_rdma_registration_start(flags);
+    trace_rdma_registration_start(flags);
     qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
     qemu_fflush(f);
 
@@ -3887,7 +3885,7 @@ int qemu_rdma_registration_start(QEMUFile *f, uint64_t flags)
  * Inform dest that dynamic registrations are done for now.
  * First, flush writes, if any.
  */
-int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags)
+int rdma_registration_stop(QEMUFile *f, uint64_t flags)
 {
     QIOChannelRDMA *rioc;
     Error *err = NULL;
@@ -3923,7 +3921,7 @@ int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags)
         int reg_result_idx, i, nb_dest_blocks;
 
         head.type = RDMA_CONTROL_RAM_BLOCKS_REQUEST;
-        trace_qemu_rdma_registration_stop_ram();
+        trace_rdma_registration_stop_ram();
 
         /*
          * Make sure that we parallelize the pinning on both sides.
@@ -3987,7 +3985,7 @@ int qemu_rdma_registration_stop(QEMUFile *f, uint64_t flags)
         }
     }
 
-    trace_qemu_rdma_registration_stop(flags);
+    trace_rdma_registration_stop(flags);
 
     head.type = RDMA_CONTROL_REGISTER_FINISHED;
     ret = qemu_rdma_exchange_send(rdma, &head, NULL, NULL, NULL, NULL, &err);
diff --git a/migration/trace-events b/migration/trace-events
index d8c2aa846d..403cc1ae11 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -231,20 +231,6 @@ qemu_rdma_post_send_control(const char *desc) "CONTROL: sending %s.."
 qemu_rdma_register_and_get_keys(uint64_t len, void *start) "Registering %" PRIu64 " bytes @ %p"
 qemu_rdma_register_odp_mr(const char *name) "Try to register On-Demand Paging memory region: %s"
 qemu_rdma_advise_mr(const char *name, uint32_t len, uint64_t addr, const char *res) "Try to advise block %s prefetch at %" PRIu32 "@0x%" PRIx64 ": %s"
-qemu_rdma_registration_handle_compress(int64_t length, int index, int64_t offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
-qemu_rdma_registration_handle_finished(void) ""
-qemu_rdma_registration_handle_ram_blocks(void) ""
-qemu_rdma_registration_handle_ram_blocks_loop(const char *name, uint64_t offset, uint64_t length, void *local_host_addr, unsigned int src_index) "%s: @0x%" PRIx64 "/%" PRIu64 " host:@%p src_index: %u"
-qemu_rdma_registration_handle_register(int requests) "%d requests"
-qemu_rdma_registration_handle_register_loop(int req, int index, uint64_t addr, uint64_t chunks) "Registration request (%d): index %d, current_addr %" PRIu64 " chunks: %" PRIu64
-qemu_rdma_registration_handle_register_rkey(int rkey) "0x%x"
-qemu_rdma_registration_handle_unregister(int requests) "%d requests"
-qemu_rdma_registration_handle_unregister_loop(int count, int index, uint64_t chunk) "Unregistration request (%d): index %d, chunk %" PRIu64
-qemu_rdma_registration_handle_unregister_success(uint64_t chunk) "%" PRIu64
-qemu_rdma_registration_handle_wait(void) ""
-qemu_rdma_registration_start(uint64_t flags) "%" PRIu64
-qemu_rdma_registration_stop(uint64_t flags) "%" PRIu64
-qemu_rdma_registration_stop_ram(void) ""
 qemu_rdma_resolve_host_trying(const char *host, const char *ip) "Trying %s => %s"
 qemu_rdma_signal_unregister_append(uint64_t chunk, int pos) "Appending unregister chunk %" PRIu64 " at position %d"
 qemu_rdma_signal_unregister_already(uint64_t chunk) "Unregister chunk %" PRIu64 " already in queue"
@@ -263,6 +249,20 @@ qemu_rdma_write_one_zero(uint64_t chunk, int len, int index, int64_t offset) "En
 rdma_add_block(const char *block_name, int block, uint64_t addr, uint64_t offset, uint64_t len, uint64_t end, uint64_t bits, int chunks) "Added Block: '%s':%d, addr: %" PRIu64 ", offset: %" PRIu64 " length: %" PRIu64 " end: %" PRIu64 " bits %" PRIu64 " chunks %d"
 rdma_block_notification_handle(const char *name, int index) "%s at %d"
 rdma_delete_block(void *block, uint64_t addr, uint64_t offset, uint64_t len, uint64_t end, uint64_t bits, int chunks) "Deleted Block: %p, addr: %" PRIu64 ", offset: %" PRIu64 " length: %" PRIu64 " end: %" PRIu64 " bits %" PRIu64 " chunks %d"
+rdma_registration_handle_compress(int64_t length, int index, int64_t offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
+rdma_registration_handle_finished(void) ""
+rdma_registration_handle_ram_blocks(void) ""
+rdma_registration_handle_ram_blocks_loop(const char *name, uint64_t offset, uint64_t length, void *local_host_addr, unsigned int src_index) "%s: @0x%" PRIx64 "/%" PRIu64 " host:@%p src_index: %u"
+rdma_registration_handle_register(int requests) "%d requests"
+rdma_registration_handle_register_loop(int req, int index, uint64_t addr, uint64_t chunks) "Registration request (%d): index %d, current_addr %" PRIu64 " chunks: %" PRIu64
+rdma_registration_handle_register_rkey(int rkey) "0x%x"
+rdma_registration_handle_unregister(int requests) "%d requests"
+rdma_registration_handle_unregister_loop(int count, int index, uint64_t chunk) "Unregistration request (%d): index %d, chunk %" PRIu64
+rdma_registration_handle_unregister_success(uint64_t chunk) "%" PRIu64
+rdma_registration_handle_wait(void) ""
+rdma_registration_start(uint64_t flags) "%" PRIu64
+rdma_registration_stop(uint64_t flags) "%" PRIu64
+rdma_registration_stop_ram(void) ""
 rdma_start_incoming_migration(void) ""
 rdma_start_incoming_migration_after_dest_init(void) ""
 rdma_start_incoming_migration_after_rdma_listen(void) ""
-- 
2.41.0




* [PULL 23/38] migration/rdma: Check sooner if we are in postcopy for save_page()
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (21 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 22/38] migration/rdma: Remove qemu_ prefix from exported functions Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 24/38] migration/rdma: Use i as for index instead of idx Juan Quintela
                   ` (15 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-11-quintela@redhat.com>
---
 migration/rdma.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index c147c94b08..e973579a52 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3240,10 +3240,6 @@ static int qemu_rdma_save_page(QEMUFile *f, ram_addr_t block_offset,
     RDMAContext *rdma;
     int ret;
 
-    if (migration_in_postcopy()) {
-        return RAM_SAVE_CONTROL_NOT_SUPP;
-    }
-
     RCU_READ_LOCK_GUARD();
     rdma = qatomic_rcu_read(&rioc->rdmaout);
 
@@ -3317,7 +3313,7 @@ err:
 int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                            ram_addr_t offset, size_t size)
 {
-    if (!migrate_rdma()) {
+    if (!migrate_rdma() || migration_in_postcopy()) {
         return RAM_SAVE_CONTROL_NOT_SUPP;
     }
 
-- 
2.41.0




* [PULL 24/38] migration/rdma: Use i as for index instead of idx
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (22 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 23/38] migration/rdma: Check sooner if we are in postcopy for save_page() Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 25/38] migration/rdma: Declare for index variables local Juan Quintela
                   ` (14 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

All remaining uses of the index are local to the for loop, so declare
the variable inside the for statement itself.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-12-quintela@redhat.com>
---
 migration/rdma.c | 49 ++++++++++++++++++++++--------------------------
 1 file changed, 22 insertions(+), 27 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index e973579a52..5f6a771e8e 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2354,7 +2354,6 @@ static int qemu_rdma_write(RDMAContext *rdma,
 static void qemu_rdma_cleanup(RDMAContext *rdma)
 {
     Error *err = NULL;
-    int idx;
 
     if (rdma->cm_id && rdma->connected) {
         if ((rdma->errored ||
@@ -2381,12 +2380,12 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
     g_free(rdma->dest_blocks);
     rdma->dest_blocks = NULL;
 
-    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
-        if (rdma->wr_data[idx].control_mr) {
+    for (int i = 0; i < RDMA_WRID_MAX; i++) {
+        if (rdma->wr_data[i].control_mr) {
             rdma->total_registrations--;
-            ibv_dereg_mr(rdma->wr_data[idx].control_mr);
+            ibv_dereg_mr(rdma->wr_data[i].control_mr);
         }
-        rdma->wr_data[idx].control_mr = NULL;
+        rdma->wr_data[i].control_mr = NULL;
     }
 
     if (rdma->local_ram_blocks.block) {
@@ -2452,7 +2451,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
 
 static int qemu_rdma_source_init(RDMAContext *rdma, bool pin_all, Error **errp)
 {
-    int ret, idx;
+    int ret;
 
     /*
      * Will be validated against destination's actual capabilities
@@ -2480,18 +2479,17 @@ static int qemu_rdma_source_init(RDMAContext *rdma, bool pin_all, Error **errp)
 
     /* Build the hash that maps from offset to RAMBlock */
     rdma->blockmap = g_hash_table_new(g_direct_hash, g_direct_equal);
-    for (idx = 0; idx < rdma->local_ram_blocks.nb_blocks; idx++) {
+    for (int i = 0; i < rdma->local_ram_blocks.nb_blocks; i++) {
         g_hash_table_insert(rdma->blockmap,
-                (void *)(uintptr_t)rdma->local_ram_blocks.block[idx].offset,
-                &rdma->local_ram_blocks.block[idx]);
+                (void *)(uintptr_t)rdma->local_ram_blocks.block[i].offset,
+                &rdma->local_ram_blocks.block[i]);
     }
 
-    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
-        ret = qemu_rdma_reg_control(rdma, idx);
+    for (int i = 0; i < RDMA_WRID_MAX; i++) {
+        ret = qemu_rdma_reg_control(rdma, i);
         if (ret < 0) {
-            error_setg(errp,
-                       "RDMA ERROR: rdma migration: error registering %d control!",
-                       idx);
+            error_setg(errp, "RDMA ERROR: rdma migration: error "
+                       "registering %d control!", i);
             goto err_rdma_source_init;
         }
     }
@@ -2625,16 +2623,16 @@ err_rdma_source_connect:
 static int qemu_rdma_dest_init(RDMAContext *rdma, Error **errp)
 {
     Error *err = NULL;
-    int ret, idx;
+    int ret;
     struct rdma_cm_id *listen_id;
     char ip[40] = "unknown";
     struct rdma_addrinfo *res, *e;
     char port_str[16];
     int reuse = 1;
 
-    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
-        rdma->wr_data[idx].control_len = 0;
-        rdma->wr_data[idx].control_curr = NULL;
+    for (int i = 0; i < RDMA_WRID_MAX; i++) {
+        rdma->wr_data[i].control_len = 0;
+        rdma->wr_data[i].control_curr = NULL;
     }
 
     if (!rdma->host || !rdma->host[0]) {
@@ -2723,11 +2721,9 @@ err_dest_init_create_listen_id:
 static void qemu_rdma_return_path_dest_init(RDMAContext *rdma_return_path,
                                             RDMAContext *rdma)
 {
-    int idx;
-
-    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
-        rdma_return_path->wr_data[idx].control_len = 0;
-        rdma_return_path->wr_data[idx].control_curr = NULL;
+    for (int i = 0; i < RDMA_WRID_MAX; i++) {
+        rdma_return_path->wr_data[i].control_len = 0;
+        rdma_return_path->wr_data[i].control_curr = NULL;
     }
 
     /*the CM channel and CM id is shared*/
@@ -3376,7 +3372,6 @@ static int qemu_rdma_accept(RDMAContext *rdma)
     struct rdma_cm_event *cm_event;
     struct ibv_context *verbs;
     int ret;
-    int idx;
 
     ret = rdma_get_cm_event(rdma->channel, &cm_event);
     if (ret < 0) {
@@ -3462,10 +3457,10 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 
     qemu_rdma_init_ram_blocks(rdma);
 
-    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
-        ret = qemu_rdma_reg_control(rdma, idx);
+    for (int i = 0; i < RDMA_WRID_MAX; i++) {
+        ret = qemu_rdma_reg_control(rdma, i);
         if (ret < 0) {
-            error_report("rdma: error registering %d control", idx);
+            error_report("rdma: error registering %d control", i);
             goto err_rdma_dest_wait;
         }
     }
-- 
2.41.0




* [PULL 25/38] migration/rdma: Declare for index variables local
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (23 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 24/38] migration/rdma: Use i as for index instead of idx Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 26/38] migration/rdma: Remove all "ret" variables that are used only once Juan Quintela
                   ` (13 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Declare variables that are only used inside a for loop in the for
statement itself.

This makes it clear that they are not used outside of the loop.
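
A generic before/after illustration (not taken from the diff below,
just the pattern being applied):

    /* before: x stays in scope after the loop, even though it is
     * never used there */
    static void count_before(int n)
    {
        int x;
        for (x = 0; x < n; x++) {
            /* x used only here */
        }
    }

    /* after: x exists only where it is used */
    static void count_after(int n)
    {
        for (int x = 0; x < n; x++) {
            /* x used only here */
        }
    }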

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-13-quintela@redhat.com>
---
 migration/rdma.c | 44 ++++++++++++++++++--------------------------
 1 file changed, 18 insertions(+), 26 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 5f6a771e8e..09015fbd1a 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -559,10 +559,8 @@ static void rdma_add_block(RDMAContext *rdma, const char *block_name,
     local->block = g_new0(RDMALocalBlock, local->nb_blocks + 1);
 
     if (local->nb_blocks) {
-        int x;
-
         if (rdma->blockmap) {
-            for (x = 0; x < local->nb_blocks; x++) {
+            for (int x = 0; x < local->nb_blocks; x++) {
                 g_hash_table_remove(rdma->blockmap,
                                     (void *)(uintptr_t)old[x].offset);
                 g_hash_table_insert(rdma->blockmap,
@@ -649,15 +647,12 @@ static void rdma_delete_block(RDMAContext *rdma, RDMALocalBlock *block)
 {
     RDMALocalBlocks *local = &rdma->local_ram_blocks;
     RDMALocalBlock *old = local->block;
-    int x;
 
     if (rdma->blockmap) {
         g_hash_table_remove(rdma->blockmap, (void *)(uintptr_t)block->offset);
     }
     if (block->pmr) {
-        int j;
-
-        for (j = 0; j < block->nb_chunks; j++) {
+        for (int j = 0; j < block->nb_chunks; j++) {
             if (!block->pmr[j]) {
                 continue;
             }
@@ -687,7 +682,7 @@ static void rdma_delete_block(RDMAContext *rdma, RDMALocalBlock *block)
     block->block_name = NULL;
 
     if (rdma->blockmap) {
-        for (x = 0; x < local->nb_blocks; x++) {
+        for (int x = 0; x < local->nb_blocks; x++) {
             g_hash_table_remove(rdma->blockmap,
                                 (void *)(uintptr_t)old[x].offset);
         }
@@ -705,7 +700,7 @@ static void rdma_delete_block(RDMAContext *rdma, RDMALocalBlock *block)
             memcpy(local->block + block->index, old + (block->index + 1),
                 sizeof(RDMALocalBlock) *
                     (local->nb_blocks - (block->index + 1)));
-            for (x = block->index; x < local->nb_blocks - 1; x++) {
+            for (int x = block->index; x < local->nb_blocks - 1; x++) {
                 local->block[x].index--;
             }
         }
@@ -725,7 +720,7 @@ static void rdma_delete_block(RDMAContext *rdma, RDMALocalBlock *block)
     local->nb_blocks--;
 
     if (local->nb_blocks && rdma->blockmap) {
-        for (x = 0; x < local->nb_blocks; x++) {
+        for (int x = 0; x < local->nb_blocks; x++) {
             g_hash_table_insert(rdma->blockmap,
                                 (void *)(uintptr_t)local->block[x].offset,
                                 &local->block[x]);
@@ -828,12 +823,12 @@ static int qemu_rdma_broken_ipv6_kernel(struct ibv_context *verbs, Error **errp)
      * Otherwise, there are no guarantees until the bug is fixed in linux.
      */
     if (!verbs) {
-        int num_devices, x;
+        int num_devices;
         struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
         bool roce_found = false;
         bool ib_found = false;
 
-        for (x = 0; x < num_devices; x++) {
+        for (int x = 0; x < num_devices; x++) {
             verbs = ibv_open_device(dev_list[x]);
             /*
              * ibv_open_device() is not documented to set errno.  If
@@ -925,7 +920,6 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, Error **errp)
     char port_str[16];
     struct rdma_cm_event *cm_event;
     char ip[40] = "unknown";
-    struct rdma_addrinfo *e;
 
     if (rdma->host == NULL || !strcmp(rdma->host, "")) {
         error_setg(errp, "RDMA ERROR: RDMA hostname has not been set");
@@ -957,7 +951,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, Error **errp)
     }
 
     /* Try all addresses, saving the first error in @err */
-    for (e = res; e != NULL; e = e->ai_next) {
+    for (struct rdma_addrinfo *e = res; e != NULL; e = e->ai_next) {
         Error **local_errp = err ? NULL : &err;
 
         inet_ntop(e->ai_family,
@@ -2777,7 +2771,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
     RDMAContext *rdma;
     int ret;
     ssize_t done = 0;
-    size_t i, len;
+    size_t len;
 
     RCU_READ_LOCK_GUARD();
     rdma = qatomic_rcu_read(&rioc->rdmaout);
@@ -2803,7 +2797,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
         return -1;
     }
 
-    for (i = 0; i < niov; i++) {
+    for (int i = 0; i < niov; i++) {
         size_t remaining = iov[i].iov_len;
         uint8_t * data = (void *)iov[i].iov_base;
         while (remaining) {
@@ -2866,7 +2860,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
     RDMAControlHeader head;
     int ret;
     ssize_t done = 0;
-    size_t i, len;
+    size_t len;
 
     RCU_READ_LOCK_GUARD();
     rdma = qatomic_rcu_read(&rioc->rdmain);
@@ -2882,7 +2876,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
         return -1;
     }
 
-    for (i = 0; i < niov; i++) {
+    for (int i = 0; i < niov; i++) {
         size_t want = iov[i].iov_len;
         uint8_t *data = (void *)iov[i].iov_base;
 
@@ -3556,8 +3550,6 @@ int rdma_registration_handle(QEMUFile *f)
     void *host_addr;
     int ret;
     int idx = 0;
-    int count = 0;
-    int i = 0;
 
     if (!migrate_rdma()) {
         return 0;
@@ -3628,7 +3620,7 @@ int rdma_registration_handle(QEMUFile *f)
             qsort(rdma->local_ram_blocks.block,
                   rdma->local_ram_blocks.nb_blocks,
                   sizeof(RDMALocalBlock), dest_ram_sort_func);
-            for (i = 0; i < local->nb_blocks; i++) {
+            for (int i = 0; i < local->nb_blocks; i++) {
                 local->block[i].index = i;
             }
 
@@ -3646,7 +3638,7 @@ int rdma_registration_handle(QEMUFile *f)
              * Both sides use the "remote" structure to communicate and update
              * their "local" descriptions with what was sent.
              */
-            for (i = 0; i < local->nb_blocks; i++) {
+            for (int i = 0; i < local->nb_blocks; i++) {
                 rdma->dest_blocks[i].remote_host_addr =
                     (uintptr_t)(local->block[i].local_host_addr);
 
@@ -3686,7 +3678,7 @@ int rdma_registration_handle(QEMUFile *f)
             reg_resp.repeat = head.repeat;
             registers = (RDMARegister *) rdma->wr_data[idx].control_curr;
 
-            for (count = 0; count < head.repeat; count++) {
+            for (int count = 0; count < head.repeat; count++) {
                 uint64_t chunk;
                 uint8_t *chunk_start, *chunk_end;
 
@@ -3761,7 +3753,7 @@ int rdma_registration_handle(QEMUFile *f)
             unreg_resp.repeat = head.repeat;
             registers = (RDMARegister *) rdma->wr_data[idx].control_curr;
 
-            for (count = 0; count < head.repeat; count++) {
+            for (int count = 0; count < head.repeat; count++) {
                 reg = &registers[count];
                 network_to_register(reg);
 
@@ -3909,7 +3901,7 @@ int rdma_registration_stop(QEMUFile *f, uint64_t flags)
     if (flags == RAM_CONTROL_SETUP) {
         RDMAControlHeader resp = {.type = RDMA_CONTROL_RAM_BLOCKS_RESULT };
         RDMALocalBlocks *local = &rdma->local_ram_blocks;
-        int reg_result_idx, i, nb_dest_blocks;
+        int reg_result_idx, nb_dest_blocks;
 
         head.type = RDMA_CONTROL_RAM_BLOCKS_REQUEST;
         trace_rdma_registration_stop_ram();
@@ -3957,7 +3949,7 @@ int rdma_registration_stop(QEMUFile *f, uint64_t flags)
         qemu_rdma_move_header(rdma, reg_result_idx, &resp);
         memcpy(rdma->dest_blocks,
             rdma->wr_data[reg_result_idx].control_curr, resp.len);
-        for (i = 0; i < nb_dest_blocks; i++) {
+        for (int i = 0; i < nb_dest_blocks; i++) {
             network_to_dest_block(&rdma->dest_blocks[i]);
 
             /* We require that the blocks are in the same order */
-- 
2.41.0




* [PULL 26/38] migration/rdma: Remove all "ret" variables that are used only once
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (24 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 25/38] migration/rdma: Declare for index variables local Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 27/38] migration: Improve json and formatting Juan Quintela
                   ` (12 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Change code of the form:

    int ret;
    ...

    ret = foo();
    if (ret < 0) {        /* or just: if (ret) { */

to:

    if (foo() < 0) {      /* or just: if (foo()) { */

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011203527.9061-14-quintela@redhat.com>
---
 migration/rdma.c | 29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 09015fbd1a..2a1852ec7f 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1107,7 +1107,6 @@ err_alloc_pd_cq:
 static int qemu_rdma_alloc_qp(RDMAContext *rdma)
 {
     struct ibv_qp_init_attr attr = { 0 };
-    int ret;
 
     attr.cap.max_send_wr = RDMA_SIGNALED_SEND_MAX;
     attr.cap.max_recv_wr = 3;
@@ -1117,8 +1116,7 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
     attr.recv_cq = rdma->recv_cq;
     attr.qp_type = IBV_QPT_RC;
 
-    ret = rdma_create_qp(rdma->cm_id, rdma->pd, &attr);
-    if (ret < 0) {
+    if (rdma_create_qp(rdma->cm_id, rdma->pd, &attr) < 0) {
         return -1;
     }
 
@@ -1130,8 +1128,8 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
 static bool rdma_support_odp(struct ibv_context *dev)
 {
     struct ibv_device_attr_ex attr = {0};
-    int ret = ibv_query_device_ex(dev, NULL, &attr);
-    if (ret) {
+
+    if (ibv_query_device_ex(dev, NULL, &attr)) {
         return false;
     }
 
@@ -1508,7 +1506,6 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma,
                                        struct ibv_comp_channel *comp_channel)
 {
     struct rdma_cm_event *cm_event;
-    int ret;
 
     /*
      * Coroutine doesn't start until migration_fd_process_incoming()
@@ -1544,8 +1541,7 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma,
                 }
 
                 if (pfds[1].revents) {
-                    ret = rdma_get_cm_event(rdma->channel, &cm_event);
-                    if (ret < 0) {
+                    if (rdma_get_cm_event(rdma->channel, &cm_event) < 0) {
                         return -1;
                     }
 
@@ -2317,12 +2313,10 @@ static int qemu_rdma_write(RDMAContext *rdma,
     uint64_t current_addr = block_offset + offset;
     uint64_t index = rdma->current_index;
     uint64_t chunk = rdma->current_chunk;
-    int ret;
 
     /* If we cannot merge it, we flush the current buffer first. */
     if (!qemu_rdma_buffer_mergeable(rdma, current_addr, len)) {
-        ret = qemu_rdma_write_flush(rdma, errp);
-        if (ret < 0) {
+        if (qemu_rdma_write_flush(rdma, errp) < 0) {
             return -1;
         }
         rdma->current_length = 0;
@@ -2936,7 +2930,6 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
 static int qemu_rdma_drain_cq(RDMAContext *rdma)
 {
     Error *err = NULL;
-    int ret;
 
     if (qemu_rdma_write_flush(rdma, &err) < 0) {
         error_report_err(err);
@@ -2944,8 +2937,7 @@ static int qemu_rdma_drain_cq(RDMAContext *rdma)
     }
 
     while (rdma->nb_sent) {
-        ret = qemu_rdma_block_for_wrid(rdma, RDMA_WRID_RDMA_WRITE, NULL);
-        if (ret < 0) {
+        if (qemu_rdma_block_for_wrid(rdma, RDMA_WRID_RDMA_WRITE, NULL) < 0) {
             error_report("rdma migration: complete polling error!");
             return -1;
         }
@@ -3323,12 +3315,10 @@ static void rdma_accept_incoming_migration(void *opaque);
 static void rdma_cm_poll_handler(void *opaque)
 {
     RDMAContext *rdma = opaque;
-    int ret;
     struct rdma_cm_event *cm_event;
     MigrationIncomingState *mis = migration_incoming_get_current();
 
-    ret = rdma_get_cm_event(rdma->channel, &cm_event);
-    if (ret < 0) {
+    if (rdma_get_cm_event(rdma->channel, &cm_event) < 0) {
         error_report("get_cm_event failed %d", errno);
         return;
     }
@@ -4053,14 +4043,11 @@ static QEMUFile *rdma_new_output(RDMAContext *rdma)
 static void rdma_accept_incoming_migration(void *opaque)
 {
     RDMAContext *rdma = opaque;
-    int ret;
     QEMUFile *f;
     Error *local_err = NULL;
 
     trace_qemu_rdma_accept_incoming_migration();
-    ret = qemu_rdma_accept(rdma);
-
-    if (ret < 0) {
+    if (qemu_rdma_accept(rdma) < 0) {
         error_report("RDMA ERROR: Migration initialization failed");
         return;
     }
-- 
2.41.0




* [PULL 27/38] migration: Improve json and formatting
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (25 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 26/38] migration/rdma: Remove all "ret" variables that are used only once Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 28/38] migration: check for rate_limit_max for RATE_LIMIT_DISABLED Juan Quintela
                   ` (11 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231013104736.31722-2-quintela@redhat.com>
---
 qapi/migration.json | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 360e609f66..db3df12d6c 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -73,7 +73,7 @@
 { 'struct': 'MigrationStats',
   'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int' ,
            'duplicate': 'int',
-           'skipped': { 'type': 'int', 'features': ['deprecated'] },
+           'skipped': { 'type': 'int', 'features': [ 'deprecated' ] },
            'normal': 'int',
            'normal-bytes': 'int', 'dirty-pages-rate': 'int',
            'mbps': 'number', 'dirty-sync-count': 'int',
@@ -440,10 +440,9 @@
 #     compress and xbzrle are both on, compress only takes effect in
 #     the ram bulk stage, after that, it will be disabled and only
 #     xbzrle takes effect, this can help to minimize migration
-#     traffic.  The feature is disabled by default.  (since 2.4 )
+#     traffic.  The feature is disabled by default.  (since 2.4)
 #
-# @events: generate events for each migration state change (since 2.4
-#     )
+# @events: generate events for each migration state change (since 2.4)
 #
 # @auto-converge: If enabled, QEMU will automatically throttle down
 #     the guest to speed up convergence of RAM migration.  (since 1.6)
-- 
2.41.0




* [PULL 28/38] migration: check for rate_limit_max for RATE_LIMIT_DISABLED
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (26 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 27/38] migration: Improve json and formatting Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 29/38] multifd: fix counters in multifd_send_thread Juan Quintela
                   ` (10 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Elena Ufimtseva

From: Elena Ufimtseva <elena.ufimtseva@oracle.com>

In migration rate limiting, atomic operations are used to read the
rate limit variables and the transferred bytes, and they are
expensive. Check first whether rate_limit_max is equal to
RATE_LIMIT_DISABLED and return false immediately if so.

Note that with this patch we will also stop flushing, by not calling
qemu_fflush() from migration_transferred_bytes(), if the migration
rate is not exceeded. This should be fine since the migration thread
calls migration_update_counters() in its loop via
migration_rate_limit(), which calls migration_transferred_bytes() and
flushes there.
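
Reconstructed from the hunk below (a sketch only; the enclosing
function and its earlier early-return are omitted), the resulting
order of operations is:

    uint64_t rate_limit_max = migration_rate_get();
    if (rate_limit_max == RATE_LIMIT_DISABLED) {
        return false;     /* cheap check first: no atomic counter reads */
    }

    uint64_t rate_limit_start = stat64_get(&mig_stats.rate_limit_start);
    uint64_t rate_limit_current = migration_transferred_bytes(f);
    uint64_t rate_limit_used = rate_limit_current - rate_limit_start;

    if (rate_limit_max > 0 && rate_limit_used > rate_limit_max) {
        return true;
    }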

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011184358.97349-2-elena.ufimtseva@oracle.com>
---
 migration/migration-stats.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/migration/migration-stats.c b/migration/migration-stats.c
index 84e11e6dd8..4cc989d975 100644
--- a/migration/migration-stats.c
+++ b/migration/migration-stats.c
@@ -24,14 +24,15 @@ bool migration_rate_exceeded(QEMUFile *f)
         return true;
     }
 
+    uint64_t rate_limit_max = migration_rate_get();
+    if (rate_limit_max == RATE_LIMIT_DISABLED) {
+        return false;
+    }
+
     uint64_t rate_limit_start = stat64_get(&mig_stats.rate_limit_start);
     uint64_t rate_limit_current = migration_transferred_bytes(f);
     uint64_t rate_limit_used = rate_limit_current - rate_limit_start;
-    uint64_t rate_limit_max = stat64_get(&mig_stats.rate_limit_max);
 
-    if (rate_limit_max == RATE_LIMIT_DISABLED) {
-        return false;
-    }
     if (rate_limit_max > 0 && rate_limit_used > rate_limit_max) {
         return true;
     }
-- 
2.41.0




* [PULL 29/38] multifd: fix counters in multifd_send_thread
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (27 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 28/38] migration: check for rate_limit_max for RATE_LIMIT_DISABLED Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 30/38] multifd: reset next_packet_len after sending pages Juan Quintela
                   ` (9 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Elena Ufimtseva

From: Elena Ufimtseva <elena.ufimtseva@oracle.com>

Previous commit cbec7eb76879d419e7dbf531ee2506ec0722e825
("migration/multifd: Compute transferred bytes correctly")
removed the accounting for packet_len in the non-RDMA case,
but next_packet_size only accounts for the pages
(normal_pages * PAGE_SIZE), not for the packet header that is
sent as iov[0]. Add packet_len back to account for the size of
MultiFDPacket and the array of the page offsets.
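
In other words, per send iteration the counters should grow by both
parts (field names as in multifd.c; "accounted" below stands in for
mig_stats.multifd_bytes and mig_stats.transferred):

    accounted += p->packet_len;        /* header sent as iov[0]: MultiFDPacket
                                          plus the array of page offsets */
    accounted += p->next_packet_size;  /* page payload: normal_pages * page size */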

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011184358.97349-4-elena.ufimtseva@oracle.com>
---
 migration/multifd.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 0f6b203877..e6e0013c16 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -714,8 +714,6 @@ static void *multifd_send_thread(void *opaque)
                 if (ret != 0) {
                     break;
                 }
-                stat64_add(&mig_stats.multifd_bytes, p->packet_len);
-                stat64_add(&mig_stats.transferred, p->packet_len);
             } else {
                 /* Send header using the same writev call */
                 p->iov[0].iov_len = p->packet_len;
@@ -728,8 +726,10 @@ static void *multifd_send_thread(void *opaque)
                 break;
             }
 
-            stat64_add(&mig_stats.multifd_bytes, p->next_packet_size);
-            stat64_add(&mig_stats.transferred, p->next_packet_size);
+            stat64_add(&mig_stats.multifd_bytes,
+                       p->next_packet_size + p->packet_len);
+            stat64_add(&mig_stats.transferred,
+                       p->next_packet_size + p->packet_len);
             qemu_mutex_lock(&p->mutex);
             p->pending_job--;
             qemu_mutex_unlock(&p->mutex);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 30/38] multifd: reset next_packet_len after sending pages
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (28 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 29/38] multifd: fix counters in multifd_send_thread Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 31/38] migration/ram: Refactor precopy ram loading code Juan Quintela
                   ` (8 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Elena Ufimtseva

From: Elena Ufimtseva <elena.ufimtseva@oracle.com>

Sometimes multifd sends just a sync packet with no pages
(normal_num is 0). In this case the old value of
next_packet_size is preserved and accounted for, even though
only packet_len is actually transferred.
Reset it to 0 after it has been sent and accounted for.
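
A sketch of the two iterations involved (field names from multifd.c,
sizes illustrative):

    /* iteration with pages:  next_packet_size = N * page_size
     *                        accounted: packet_len + N * page_size   (correct)
     * sync-only iteration:   normal_num == 0, only the header is written,
     *                        but next_packet_size still holds N * page_size,
     *                        so packet_len + N * page_size is accounted again.
     * Zeroing next_packet_size after the accounting (hunk below) avoids that. */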

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011184358.97349-5-elena.ufimtseva@oracle.com>
---
 migration/multifd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/multifd.c b/migration/multifd.c
index e6e0013c16..c45f5015f8 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -730,6 +730,7 @@ static void *multifd_send_thread(void *opaque)
                        p->next_packet_size + p->packet_len);
             stat64_add(&mig_stats.transferred,
                        p->next_packet_size + p->packet_len);
+            p->next_packet_size = 0;
             qemu_mutex_lock(&p->mutex);
             p->pending_job--;
             qemu_mutex_unlock(&p->mutex);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 31/38] migration/ram: Refactor precopy ram loading code
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (29 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 30/38] multifd: reset next_packet_len after sending pages Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 32/38] migration/ram: Remove RAMState from xbzrle_cache_zero_page Juan Quintela
                   ` (7 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Nikolay Borisov, Philippe Mathieu-Daudé

From: Nikolay Borisov <nborisov@suse.com>

Extract the ramblock parsing code into a routine that operates on the
sequence of headers from the stream and another that parses an
individual ramblock. This makes ram_load_precopy() easier to
comprehend.
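
For orientation, the per-RAMBlock record that the new helpers read from
the stream looks like this (inferred from the qemu_get_*() calls in the
hunk below):

    /* RAM_SAVE_FLAG_MEM_SIZE section, repeated until total_ram_bytes is consumed:
     *   u8     length of the block id string           (parse_ramblocks)
     *   bytes  block id string, not NUL-terminated     (parse_ramblocks)
     *   be64   used_length of the block                (parse_ramblocks)
     *   be64   remote page size  - only if postcopy was advised and the
     *                              block's page size differs from the host's
     *   be64   block GPA         - only if x-ignore-shared is enabled
     */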

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011184604.32364-3-farosas@suse.de>
---
 migration/ram.c | 146 +++++++++++++++++++++++++++---------------------
 1 file changed, 82 insertions(+), 64 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index a9bc6ae1f1..2bd51d6bb5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3884,6 +3884,85 @@ void colo_flush_ram_cache(void)
     trace_colo_flush_ram_cache_end();
 }
 
+static int parse_ramblock(QEMUFile *f, RAMBlock *block, ram_addr_t length)
+{
+    int ret = 0;
+    /* ADVISE is earlier, it shows the source has the postcopy capability on */
+    bool postcopy_advised = migration_incoming_postcopy_advised();
+
+    assert(block);
+
+    if (!qemu_ram_is_migratable(block)) {
+        error_report("block %s should not be migrated !", block->idstr);
+        return -EINVAL;
+    }
+
+    if (length != block->used_length) {
+        Error *local_err = NULL;
+
+        ret = qemu_ram_resize(block, length, &local_err);
+        if (local_err) {
+            error_report_err(local_err);
+        }
+    }
+    /* For postcopy we need to check hugepage sizes match */
+    if (postcopy_advised && migrate_postcopy_ram() &&
+        block->page_size != qemu_host_page_size) {
+        uint64_t remote_page_size = qemu_get_be64(f);
+        if (remote_page_size != block->page_size) {
+            error_report("Mismatched RAM page size %s "
+                         "(local) %zd != %" PRId64, block->idstr,
+                         block->page_size, remote_page_size);
+            ret = -EINVAL;
+        }
+    }
+    if (migrate_ignore_shared()) {
+        hwaddr addr = qemu_get_be64(f);
+        if (migrate_ram_is_ignored(block) &&
+            block->mr->addr != addr) {
+            error_report("Mismatched GPAs for block %s "
+                         "%" PRId64 "!= %" PRId64, block->idstr,
+                         (uint64_t)addr, (uint64_t)block->mr->addr);
+            ret = -EINVAL;
+        }
+    }
+    ret = rdma_block_notification_handle(f, block->idstr);
+    if (ret < 0) {
+        qemu_file_set_error(f, ret);
+    }
+
+    return ret;
+}
+
+static int parse_ramblocks(QEMUFile *f, ram_addr_t total_ram_bytes)
+{
+    int ret = 0;
+
+    /* Synchronize RAM block list */
+    while (!ret && total_ram_bytes) {
+        RAMBlock *block;
+        char id[256];
+        ram_addr_t length;
+        int len = qemu_get_byte(f);
+
+        qemu_get_buffer(f, (uint8_t *)id, len);
+        id[len] = 0;
+        length = qemu_get_be64(f);
+
+        block = qemu_ram_block_by_name(id);
+        if (block) {
+            ret = parse_ramblock(f, block, length);
+        } else {
+            error_report("Unknown ramblock \"%s\", cannot accept "
+                         "migration", id);
+            ret = -EINVAL;
+        }
+        total_ram_bytes -= length;
+    }
+
+    return ret;
+}
+
 /**
  * ram_load_precopy: load pages in precopy case
  *
@@ -3898,14 +3977,13 @@ static int ram_load_precopy(QEMUFile *f)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
     int flags = 0, ret = 0, invalid_flags = 0, len = 0, i = 0;
-    /* ADVISE is earlier, it shows the source has the postcopy capability on */
-    bool postcopy_advised = migration_incoming_postcopy_advised();
+
     if (!migrate_compress()) {
         invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
     }
 
     while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
-        ram_addr_t addr, total_ram_bytes;
+        ram_addr_t addr;
         void *host = NULL, *host_bak = NULL;
         uint8_t ch;
 
@@ -3976,67 +4054,7 @@ static int ram_load_precopy(QEMUFile *f)
 
         switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
         case RAM_SAVE_FLAG_MEM_SIZE:
-            /* Synchronize RAM block list */
-            total_ram_bytes = addr;
-            while (!ret && total_ram_bytes) {
-                RAMBlock *block;
-                char id[256];
-                ram_addr_t length;
-
-                len = qemu_get_byte(f);
-                qemu_get_buffer(f, (uint8_t *)id, len);
-                id[len] = 0;
-                length = qemu_get_be64(f);
-
-                block = qemu_ram_block_by_name(id);
-                if (block && !qemu_ram_is_migratable(block)) {
-                    error_report("block %s should not be migrated !", id);
-                    ret = -EINVAL;
-                } else if (block) {
-                    if (length != block->used_length) {
-                        Error *local_err = NULL;
-
-                        ret = qemu_ram_resize(block, length,
-                                              &local_err);
-                        if (local_err) {
-                            error_report_err(local_err);
-                        }
-                    }
-                    /* For postcopy we need to check hugepage sizes match */
-                    if (postcopy_advised && migrate_postcopy_ram() &&
-                        block->page_size != qemu_host_page_size) {
-                        uint64_t remote_page_size = qemu_get_be64(f);
-                        if (remote_page_size != block->page_size) {
-                            error_report("Mismatched RAM page size %s "
-                                         "(local) %zd != %" PRId64,
-                                         id, block->page_size,
-                                         remote_page_size);
-                            ret = -EINVAL;
-                        }
-                    }
-                    if (migrate_ignore_shared()) {
-                        hwaddr addr2 = qemu_get_be64(f);
-                        if (migrate_ram_is_ignored(block) &&
-                            block->mr->addr != addr2) {
-                            error_report("Mismatched GPAs for block %s "
-                                         "%" PRId64 "!= %" PRId64,
-                                         id, (uint64_t)addr2,
-                                         (uint64_t)block->mr->addr);
-                            ret = -EINVAL;
-                        }
-                    }
-                    ret = rdma_block_notification_handle(f, block->idstr);
-                    if (ret < 0) {
-                        qemu_file_set_error(f, ret);
-                    }
-                } else {
-                    error_report("Unknown ramblock \"%s\", cannot "
-                                 "accept migration", id);
-                    ret = -EINVAL;
-                }
-
-                total_ram_bytes -= length;
-            }
+            ret = parse_ramblocks(f, addr);
             break;
 
         case RAM_SAVE_FLAG_ZERO:
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 32/38] migration/ram: Remove RAMState from xbzrle_cache_zero_page
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (30 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 31/38] migration/ram: Refactor precopy ram loading code Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 33/38] migration/ram: Stop passing QEMUFile around in save_zero_page Juan Quintela
                   ` (6 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

'rs' is not used in that function. It's a leftover from commit
9360447d34 ("ram: Use MigrationStats for statistics").

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011184604.32364-4-farosas@suse.de>
---
 migration/ram.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 2bd51d6bb5..535172d9be 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -570,7 +570,6 @@ void mig_throttle_counter_reset(void)
 /**
  * xbzrle_cache_zero_page: insert a zero page in the XBZRLE cache
  *
- * @rs: current RAM state
  * @current_addr: address for the zero page
  *
  * Update the xbzrle cache to reflect a page that's been sent as all 0.
@@ -579,7 +578,7 @@ void mig_throttle_counter_reset(void)
  * As a bonus, if the page wasn't in the cache it gets added so that
  * when a small write is made into the 0'd page it gets XBZRLE sent.
  */
-static void xbzrle_cache_zero_page(RAMState *rs, ram_addr_t current_addr)
+static void xbzrle_cache_zero_page(ram_addr_t current_addr)
 {
     /* We don't care if this fails to allocate a new cache page
      * as long as it updated an old one */
@@ -2146,7 +2145,7 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
          */
         if (rs->xbzrle_started) {
             XBZRLE_cache_lock();
-            xbzrle_cache_zero_page(rs, block->offset + offset);
+            xbzrle_cache_zero_page(block->offset + offset);
             XBZRLE_cache_unlock();
         }
         return res;
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 33/38] migration/ram: Stop passing QEMUFile around in save_zero_page
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (31 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 32/38] migration/ram: Remove RAMState from xbzrle_cache_zero_page Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:29 ` [PULL 34/38] migration/ram: Move xbzrle zero page handling into save_zero_page Juan Quintela
                   ` (5 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

We don't need the QEMUFile when we're already passing the
PageSearchStatus.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011184604.32364-5-farosas@suse.de>
---
 migration/ram.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 535172d9be..2ec28c4507 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1147,10 +1147,11 @@ void ram_release_page(const char *rbname, uint64_t offset)
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  */
-static int save_zero_page_to_file(PageSearchStatus *pss, QEMUFile *file,
-                                  RAMBlock *block, ram_addr_t offset)
+static int save_zero_page_to_file(PageSearchStatus *pss, RAMBlock *block,
+                                  ram_addr_t offset)
 {
     uint8_t *p = block->host + offset;
+    QEMUFile *file = pss->pss_channel;
     int len = 0;
 
     if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
@@ -1171,10 +1172,10 @@ static int save_zero_page_to_file(PageSearchStatus *pss, QEMUFile *file,
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  */
-static int save_zero_page(PageSearchStatus *pss, QEMUFile *f, RAMBlock *block,
+static int save_zero_page(PageSearchStatus *pss, RAMBlock *block,
                           ram_addr_t offset)
 {
-    int len = save_zero_page_to_file(pss, f, block, offset);
+    int len = save_zero_page_to_file(pss, block, offset);
 
     if (len) {
         stat64_add(&mig_stats.zero_pages, 1);
@@ -2138,7 +2139,7 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
         return 1;
     }
 
-    res = save_zero_page(pss, pss->pss_channel, block, offset);
+    res = save_zero_page(pss, block, offset);
     if (res > 0) {
         /* Must let xbzrle know, otherwise a previous (now 0'd) cached
          * page would be stale
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 34/38] migration/ram: Move xbzrle zero page handling into save_zero_page
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (32 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 33/38] migration/ram: Stop passing QEMUFile around in save_zero_page Juan Quintela
@ 2023-10-17  8:29 ` Juan Quintela
  2023-10-17  8:30 ` [PULL 35/38] migration/ram: Merge save_zero_page functions Juan Quintela
                   ` (4 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:29 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

It makes a bit more sense to have the zero page handling of xbzrle
right where we save the zero page.

Also invert the exit condition to remove one level of indentation,
which makes the next patch easier to grasp.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011184604.32364-6-farosas@suse.de>
---
 migration/ram.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 2ec28c4507..229cad5c74 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1168,21 +1168,34 @@ static int save_zero_page_to_file(PageSearchStatus *pss, RAMBlock *block,
  *
  * Returns the number of pages written.
  *
+ * @rs: current RAM state
  * @pss: current PSS channel
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  */
-static int save_zero_page(PageSearchStatus *pss, RAMBlock *block,
+static int save_zero_page(RAMState *rs, PageSearchStatus *pss, RAMBlock *block,
                           ram_addr_t offset)
 {
     int len = save_zero_page_to_file(pss, block, offset);
 
-    if (len) {
-        stat64_add(&mig_stats.zero_pages, 1);
-        ram_transferred_add(len);
-        return 1;
+    if (!len) {
+        return -1;
     }
-    return -1;
+
+    stat64_add(&mig_stats.zero_pages, 1);
+    ram_transferred_add(len);
+
+    /*
+     * Must let xbzrle know, otherwise a previous (now 0'd) cached
+     * page would be stale.
+     */
+    if (rs->xbzrle_started) {
+        XBZRLE_cache_lock();
+        xbzrle_cache_zero_page(block->offset + offset);
+        XBZRLE_cache_unlock();
+    }
+
+    return 1;
 }
 
 /*
@@ -2139,16 +2152,8 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
         return 1;
     }
 
-    res = save_zero_page(pss, block, offset);
+    res = save_zero_page(rs, pss, block, offset);
     if (res > 0) {
-        /* Must let xbzrle know, otherwise a previous (now 0'd) cached
-         * page would be stale
-         */
-        if (rs->xbzrle_started) {
-            XBZRLE_cache_lock();
-            xbzrle_cache_zero_page(block->offset + offset);
-            XBZRLE_cache_unlock();
-        }
         return res;
     }
 
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 35/38] migration/ram: Merge save_zero_page functions
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (33 preceding siblings ...)
  2023-10-17  8:29 ` [PULL 34/38] migration/ram: Move xbzrle zero page handling into save_zero_page Juan Quintela
@ 2023-10-17  8:30 ` Juan Quintela
  2023-10-17  8:30 ` [PULL 36/38] migration/multifd: Remove direct "socket" references Juan Quintela
                   ` (3 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

We don't need to do this in two pieces. A single function makes it
easier to grasp, especially since it removes the indirection in the
return value handling.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231011184604.32364-7-farosas@suse.de>
---
 migration/ram.c | 46 +++++++++++++---------------------------------
 1 file changed, 13 insertions(+), 33 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 229cad5c74..c844151ee9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1137,32 +1137,6 @@ void ram_release_page(const char *rbname, uint64_t offset)
     ram_discard_range(rbname, offset, TARGET_PAGE_SIZE);
 }
 
-/**
- * save_zero_page_to_file: send the zero page to the file
- *
- * Returns the size of data written to the file, 0 means the page is not
- * a zero page
- *
- * @pss: current PSS channel
- * @block: block that contains the page we want to send
- * @offset: offset inside the block for the page
- */
-static int save_zero_page_to_file(PageSearchStatus *pss, RAMBlock *block,
-                                  ram_addr_t offset)
-{
-    uint8_t *p = block->host + offset;
-    QEMUFile *file = pss->pss_channel;
-    int len = 0;
-
-    if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
-        len += save_page_header(pss, file, block, offset | RAM_SAVE_FLAG_ZERO);
-        qemu_put_byte(file, 0);
-        len += 1;
-        ram_release_page(block->idstr, offset);
-    }
-    return len;
-}
-
 /**
  * save_zero_page: send the zero page to the stream
  *
@@ -1176,12 +1150,19 @@ static int save_zero_page_to_file(PageSearchStatus *pss, RAMBlock *block,
 static int save_zero_page(RAMState *rs, PageSearchStatus *pss, RAMBlock *block,
                           ram_addr_t offset)
 {
-    int len = save_zero_page_to_file(pss, block, offset);
+    uint8_t *p = block->host + offset;
+    QEMUFile *file = pss->pss_channel;
+    int len = 0;
 
-    if (!len) {
-        return -1;
+    if (!buffer_is_zero(p, TARGET_PAGE_SIZE)) {
+        return 0;
     }
 
+    len += save_page_header(pss, file, block, offset | RAM_SAVE_FLAG_ZERO);
+    qemu_put_byte(file, 0);
+    len += 1;
+    ram_release_page(block->idstr, offset);
+
     stat64_add(&mig_stats.zero_pages, 1);
     ram_transferred_add(len);
 
@@ -1195,7 +1176,7 @@ static int save_zero_page(RAMState *rs, PageSearchStatus *pss, RAMBlock *block,
         XBZRLE_cache_unlock();
     }
 
-    return 1;
+    return len;
 }
 
 /*
@@ -2152,9 +2133,8 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
         return 1;
     }
 
-    res = save_zero_page(rs, pss, block, offset);
-    if (res > 0) {
-        return res;
+    if (save_zero_page(rs, pss, block, offset)) {
+        return 1;
     }
 
     /*
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 36/38] migration/multifd: Remove direct "socket" references
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (34 preceding siblings ...)
  2023-10-17  8:30 ` [PULL 35/38] migration/ram: Merge save_zero_page functions Juan Quintela
@ 2023-10-17  8:30 ` Juan Quintela
  2023-10-17  8:30 ` [PULL 37/38] migration/multifd: Unify multifd_send_thread error paths Juan Quintela
                   ` (2 subsequent siblings)
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Philippe Mathieu-Daudé

From: Fabiano Rosas <farosas@suse.de>

We're about to enable support for other transports in multifd, so
remove direct references to sockets.
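
As a purely hypothetical illustration of where this is heading (the
migrate_to_file()/file_send_channel_destroy() names below are invented
for the example and are not part of this series), the new wrapper gives
a single place to dispatch on the transport later:

    static int multifd_send_channel_destroy(QIOChannel *send)
    {
        if (migrate_to_file()) {                      /* invented predicate */
            return file_send_channel_destroy(send);   /* invented helper */
        }
        return socket_send_channel_destroy(send);
    }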

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231012134343.23757-2-farosas@suse.de>
---
 migration/multifd.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index c45f5015f8..8e9a5ee394 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -510,6 +510,11 @@ static void multifd_send_terminate_threads(Error *err)
     }
 }
 
+static int multifd_send_channel_destroy(QIOChannel *send)
+{
+    return socket_send_channel_destroy(send);
+}
+
 void multifd_save_cleanup(void)
 {
     int i;
@@ -532,7 +537,7 @@ void multifd_save_cleanup(void)
         if (p->registered_yank) {
             migration_ioc_unregister_yank(p->c);
         }
-        socket_send_channel_destroy(p->c);
+        multifd_send_channel_destroy(p->c);
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);
         qemu_sem_destroy(&p->sem);
@@ -890,20 +895,25 @@ static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
 static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
 {
     MultiFDSendParams *p = opaque;
-    QIOChannel *sioc = QIO_CHANNEL(qio_task_get_source(task));
+    QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
     Error *local_err = NULL;
 
     trace_multifd_new_send_channel_async(p->id);
     if (!qio_task_propagate_error(task, &local_err)) {
-        p->c = sioc;
+        p->c = ioc;
         qio_channel_set_delay(p->c, false);
         p->running = true;
-        if (multifd_channel_connect(p, sioc, local_err)) {
+        if (multifd_channel_connect(p, ioc, local_err)) {
             return;
         }
     }
 
-    multifd_new_send_channel_cleanup(p, sioc, local_err);
+    multifd_new_send_channel_cleanup(p, ioc, local_err);
+}
+
+static void multifd_new_send_channel_create(gpointer opaque)
+{
+    socket_send_channel_create(multifd_new_send_channel_async, opaque);
 }
 
 int multifd_save_setup(Error **errp)
@@ -952,7 +962,7 @@ int multifd_save_setup(Error **errp)
             p->write_flags = 0;
         }
 
-        socket_send_channel_create(multifd_new_send_channel_async, p);
+        multifd_new_send_channel_create(p);
     }
 
     for (i = 0; i < thread_count; i++) {
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 37/38] migration/multifd: Unify multifd_send_thread error paths
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (35 preceding siblings ...)
  2023-10-17  8:30 ` [PULL 36/38] migration/multifd: Remove direct "socket" references Juan Quintela
@ 2023-10-17  8:30 ` Juan Quintela
  2023-10-17  8:30 ` [PULL 38/38] migration/multifd: Clarify Error usage in multifd_channel_connect Juan Quintela
  2023-10-17 19:04 ` [PULL 00/38] Migration 20231017 patches Stefan Hajnoczi
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier

From: Fabiano Rosas <farosas@suse.de>

The preferred usage of the Error type is to always set both the return
code and the error when a failure happens. As all code called from the
send thread follows this pattern, we'll always have the return code
and the error set at the same time.

Aside from the convention, in this piece of code this must be the
case; otherwise the if (ret != 0) path would exit the thread without
calling multifd_send_terminate_threads(), which would be incorrect.
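
The pattern in question looks roughly like this inside the send loop
(some_send_step() is a placeholder name for the real I/O calls, which
all take &local_err and return non-zero on failure):

    ret = some_send_step(p, &local_err);    /* placeholder name */
    if (ret != 0) {
        break;    /* local_err was set by the failing callee */
    }

so on the exit path ret != 0 implies local_err != NULL, which the new
assert(local_err) makes explicit.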

Unify both paths to make it clear that both are taken when there's an
error.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231012134343.23757-3-farosas@suse.de>
---
 migration/multifd.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 8e9a5ee394..c92955de41 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -753,19 +753,13 @@ static void *multifd_send_thread(void *opaque)
     }
 
 out:
-    if (local_err) {
+    if (ret) {
+        assert(local_err);
         trace_multifd_send_error(p->id);
         multifd_send_terminate_threads(local_err);
-        error_free(local_err);
-    }
-
-    /*
-     * Error happen, I will exit, but I can't just leave, tell
-     * who pay attention to me.
-     */
-    if (ret != 0) {
         qemu_sem_post(&p->sem_sync);
         qemu_sem_post(&multifd_send_state->channels_ready);
+        error_free(local_err);
     }
 
     qemu_mutex_lock(&p->mutex);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PULL 38/38] migration/multifd: Clarify Error usage in multifd_channel_connect
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (36 preceding siblings ...)
  2023-10-17  8:30 ` [PULL 37/38] migration/multifd: Unify multifd_send_thread error paths Juan Quintela
@ 2023-10-17  8:30 ` Juan Quintela
  2023-10-17 19:04 ` [PULL 00/38] Migration 20231017 patches Stefan Hajnoczi
  38 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-17  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier,
	Philippe Mathieu-Daudé

From: Fabiano Rosas <farosas@suse.de>

The function is currently called from two sites: one always gives it a
NULL Error and the other always gives it a non-NULL Error.

In the non-NULL case, all it does is trace the error and return. One
of the callers already has tracing; add a tracepoint to the other and
stop passing the error into the function.
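
For readers less familiar with the convention: with an Error **errp
parameter the callee reports failure through both the return value and
*errp. A minimal made-up example (not code from this series):

    #include "qapi/error.h"

    /* invented helper and failure condition, for illustration only */
    static bool do_thing(int value, Error **errp)
    {
        if (value < 0) {
            error_setg(errp, "value must be non-negative, got %d", value);
            return false;
        }
        return true;
    }

    /* callers pass errp through and check the return value */
    static bool caller(Error **errp)
    {
        return do_thing(-1, errp);
    }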

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231012134343.23757-4-farosas@suse.de>
---
 migration/multifd.c    | 60 ++++++++++++++++++++----------------------
 migration/trace-events |  3 ++-
 2 files changed, 30 insertions(+), 33 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index c92955de41..1fe53d3b98 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -775,7 +775,7 @@ out:
 
 static bool multifd_channel_connect(MultiFDSendParams *p,
                                     QIOChannel *ioc,
-                                    Error *error);
+                                    Error **errp);
 
 static void multifd_tls_outgoing_handshake(QIOTask *task,
                                            gpointer opaque)
@@ -784,21 +784,22 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
     QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
     Error *err = NULL;
 
-    if (qio_task_propagate_error(task, &err)) {
-        trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
-    } else {
+    if (!qio_task_propagate_error(task, &err)) {
         trace_multifd_tls_outgoing_handshake_complete(ioc);
+        if (multifd_channel_connect(p, ioc, &err)) {
+            return;
+        }
     }
 
-    if (!multifd_channel_connect(p, ioc, err)) {
-        /*
-         * Error happen, mark multifd_send_thread status as 'quit' although it
-         * is not created, and then tell who pay attention to me.
-         */
-        p->quit = true;
-        qemu_sem_post(&multifd_send_state->channels_ready);
-        qemu_sem_post(&p->sem_sync);
-    }
+    trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
+
+    /*
+     * Error happen, mark multifd_send_thread status as 'quit' although it
+     * is not created, and then tell who pay attention to me.
+     */
+    p->quit = true;
+    qemu_sem_post(&multifd_send_state->channels_ready);
+    qemu_sem_post(&p->sem_sync);
 }
 
 static void *multifd_tls_handshake_thread(void *opaque)
@@ -814,7 +815,7 @@ static void *multifd_tls_handshake_thread(void *opaque)
     return NULL;
 }
 
-static void multifd_tls_channel_connect(MultiFDSendParams *p,
+static bool multifd_tls_channel_connect(MultiFDSendParams *p,
                                         QIOChannel *ioc,
                                         Error **errp)
 {
@@ -824,7 +825,7 @@ static void multifd_tls_channel_connect(MultiFDSendParams *p,
 
     tioc = migration_tls_client_create(ioc, hostname, errp);
     if (!tioc) {
-        return;
+        return false;
     }
 
     object_unref(OBJECT(ioc));
@@ -834,31 +835,25 @@ static void multifd_tls_channel_connect(MultiFDSendParams *p,
     qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
                        multifd_tls_handshake_thread, p,
                        QEMU_THREAD_JOINABLE);
+    return true;
 }
 
 static bool multifd_channel_connect(MultiFDSendParams *p,
                                     QIOChannel *ioc,
-                                    Error *error)
+                                    Error **errp)
 {
     trace_multifd_set_outgoing_channel(
         ioc, object_get_typename(OBJECT(ioc)),
-        migrate_get_current()->hostname, error);
+        migrate_get_current()->hostname);
 
-    if (error) {
-        return false;
-    }
     if (migrate_channel_requires_tls_upgrade(ioc)) {
-        multifd_tls_channel_connect(p, ioc, &error);
-        if (!error) {
-            /*
-             * tls_channel_connect will call back to this
-             * function after the TLS handshake,
-             * so we mustn't call multifd_send_thread until then
-             */
-            return true;
-        } else {
-            return false;
-        }
+        /*
+         * tls_channel_connect will call back to this
+         * function after the TLS handshake,
+         * so we mustn't call multifd_send_thread until then
+         */
+        return multifd_tls_channel_connect(p, ioc, errp);
+
     } else {
         migration_ioc_register_yank(ioc);
         p->registered_yank = true;
@@ -897,11 +892,12 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
         p->c = ioc;
         qio_channel_set_delay(p->c, false);
         p->running = true;
-        if (multifd_channel_connect(p, ioc, local_err)) {
+        if (multifd_channel_connect(p, ioc, &local_err)) {
             return;
         }
     }
 
+    trace_multifd_new_send_channel_async_error(p->id, local_err);
     multifd_new_send_channel_cleanup(p, ioc, local_err);
 }
 
diff --git a/migration/trace-events b/migration/trace-events
index 403cc1ae11..fa9486dffe 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -125,6 +125,7 @@ postcopy_preempt_reset_channel(void) ""
 
 # multifd.c
 multifd_new_send_channel_async(uint8_t id) "channel %u"
+multifd_new_send_channel_async_error(uint8_t id, void *err) "channel=%u err=%p"
 multifd_recv(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " pages %u flags 0x%x next packet size %u"
 multifd_recv_new_channel(uint8_t id) "channel %u"
 multifd_recv_sync_main(long packet_num) "packet num %ld"
@@ -144,7 +145,7 @@ multifd_send_thread_start(uint8_t id) "%u"
 multifd_tls_outgoing_handshake_start(void *ioc, void *tioc, const char *hostname) "ioc=%p tioc=%p hostname=%s"
 multifd_tls_outgoing_handshake_error(void *ioc, const char *err) "ioc=%p err=%s"
 multifd_tls_outgoing_handshake_complete(void *ioc) "ioc=%p"
-multifd_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname, void *err)  "ioc=%p ioctype=%s hostname=%s err=%p"
+multifd_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname)  "ioc=%p ioctype=%s hostname=%s"
 
 # migration.c
 await_return_path_close_on_source_close(void) ""
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PULL 00/38] Migration 20231017 patches
  2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
                   ` (37 preceding siblings ...)
  2023-10-17  8:30 ` [PULL 38/38] migration/multifd: Clarify Error usage in multifd_channel_connect Juan Quintela
@ 2023-10-17 19:04 ` Stefan Hajnoczi
  38 siblings, 0 replies; 49+ messages in thread
From: Stefan Hajnoczi @ 2023-10-17 19:04 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Juan Quintela, Laurent Vivier


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any user-visible changes.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PULL 13/38] migration: Non multifd migration don't care about multifd flushes
  2023-10-17  8:29 ` [PULL 13/38] migration: Non multifd migration don't care about multifd flushes Juan Quintela
@ 2023-10-19 11:47   ` Michael Tokarev
  2023-10-19 12:03     ` Juan Quintela
  0 siblings, 1 reply; 49+ messages in thread
From: Michael Tokarev @ 2023-10-19 11:47 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel
  Cc: Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Laurent Vivier

17.10.2023 11:29, Juan Quintela:
> RDMA was having trouble because
> migrate_multifd_flush_after_each_section() can only be true or false,
> but we don't want to send any flush when we are not in multifd
> migration.
> 
> CC: Fabiano Rosas <farosas@suse.de
> Fixes: 294e5a4034e81 ("multifd: Only flush once each full round of memory")

Is it worth to pick it up for stable-8.1?

/mjt


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PULL 13/38] migration: Non multifd migration don't care about multifd flushes
  2023-10-19 11:47   ` Michael Tokarev
@ 2023-10-19 12:03     ` Juan Quintela
  0 siblings, 0 replies; 49+ messages in thread
From: Juan Quintela @ 2023-10-19 12:03 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: qemu-devel, Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Laurent Vivier

Michael Tokarev <mjt@tls.msk.ru> wrote:
> 17.10.2023 11:29, Juan Quintela:
>> RDMA was having trouble because
>> migrate_multifd_flush_after_each_section() can only be true or false,
>> but we don't want to send any flush when we are not in multifd
>> migration.
>> CC: Fabiano Rosas <farosas@suse.de
>> Fixes: 294e5a4034e81 ("multifd: Only flush once each full round of memory")
>
> Is it worth to pick it up for stable-8.1?

Yeap, please.

Later, Juan.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
  2023-10-17  8:29 ` [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script Juan Quintela
@ 2024-05-21 12:24   ` Alex Bennée
  2024-05-21 12:46     ` Fabiano Rosas
  0 siblings, 1 reply; 49+ messages in thread
From: Alex Bennée @ 2024-05-21 12:24 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Eric Blake, Thomas Huth, Fabiano Rosas, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Laurent Vivier, Richard Henderson

Juan Quintela <quintela@redhat.com> writes:

> From: Fabiano Rosas <farosas@suse.de>
>
> Add a smoke test that migrates to a file and gives it to the
> script. It should catch the most annoying errors such as changes in
> the ram flags.
>
> After code has been merged it becomes way harder to figure out what is
> causing the script to fail, the person making the change is the most
> likely to know right away what the problem is.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> Acked-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Message-ID: <20231009184326.15777-7-farosas@suse.de>

I bisected the failures I'm seeing on s390x to the introduction of this
script. I don't know if it's simply a timeout on a relatively slow VM:

Summary of Failures:

 36/546 qemu:qtest+qtest-s390x / qtest-s390x/migration-test                       ERROR          93.51s   killed by signal 6 SIGABRT

It seems to be unstable as we pass sometimes:

11:26:42 [ajb@qemu01:~/l/q/b/system] master|… + ./pyvenv/bin/meson test --repeat 100 qtest-s390x/migration-test
ninja: Entering directory `/home/ajb/lsrc/qemu.git/builds/system'
[1/9] Generating qemu-version.h with a custom command (wrapped by meson to capture output)
  1/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          ERROR          251.98s   killed by signal 6 SIGABRT
>>> MALLOC_PERTURB_=9 PYTHON=/home/ajb/lsrc/qemu.git/builds/system/pyvenv/bin/python3 G_TEST_DBUS_DAEMON=/home/ajb/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-s390x QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon /home/ajb/lsrc/qemu.git/builds/system/tests/qtest/migration-test --tap -k

  2/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          ERROR          258.71s   killed by signal 6 SIGABRT
>>> PYTHON=/home/ajb/lsrc/qemu.git/builds/system/pyvenv/bin/python3 MALLOC_PERTURB_=205 G_TEST_DBUS_DAEMON=/home/ajb/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-s390x QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon /home/ajb/lsrc/qemu.git/builds/system/tests/qtest/migration-test --tap -k

  3/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             302.53s   46 subtests passed
  4/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             319.56s   46 subtests passed
  5/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             320.11s   46 subtests passed
  6/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             328.40s   46 subtests passed

Ok:                 4   
Expected Fail:      0   
Fail:               2   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            0   

> ---
>  tests/qtest/migration-test.c | 60 ++++++++++++++++++++++++++++++++++++
>  tests/qtest/meson.build      |  2 ++
>  2 files changed, 62 insertions(+)
>
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 8eb2053dbb..cef5081f8c 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -66,6 +66,8 @@ static bool got_dst_resume;
>   */
>  #define DIRTYLIMIT_TOLERANCE_RANGE  25  /* MB/s */
>  
> +#define ANALYZE_SCRIPT "scripts/analyze-migration.py"
> +
>  #if defined(__linux__)
>  #include <sys/syscall.h>
>  #include <sys/vfs.h>
> @@ -1501,6 +1503,61 @@ static void test_baddest(void)
>      test_migrate_end(from, to, false);
>  }
>  
> +#ifndef _WIN32
> +static void test_analyze_script(void)
> +{
> +    MigrateStart args = {
> +        .opts_source = "-uuid 11111111-1111-1111-1111-111111111111",
> +    };
> +    QTestState *from, *to;
> +    g_autofree char *uri = NULL;
> +    g_autofree char *file = NULL;
> +    int pid, wstatus;
> +    const char *python = g_getenv("PYTHON");
> +
> +    if (!python) {
> +        g_test_skip("PYTHON variable not set");
> +        return;
> +    }
> +
> +    /* dummy url */
> +    if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", &args)) {
> +        return;
> +    }
> +
> +    /*
> +     * Setting these two capabilities causes the "configuration"
> +     * vmstate to include subsections for them. The script needs to
> +     * parse those subsections properly.
> +     */
> +    migrate_set_capability(from, "validate-uuid", true);
> +    migrate_set_capability(from, "x-ignore-shared", true);
> +
> +    file = g_strdup_printf("%s/migfile", tmpfs);
> +    uri = g_strdup_printf("exec:cat > %s", file);
> +
> +    migrate_ensure_converge(from);
> +    migrate_qmp(from, uri, "{}");
> +    wait_for_migration_complete(from);
> +
> +    pid = fork();
> +    if (!pid) {
> +        close(1);
> +        open("/dev/null", O_WRONLY);
> +        execl(python, python, ANALYZE_SCRIPT, "-f", file, NULL);
> +        g_assert_not_reached();
> +    }
> +
> +    g_assert(waitpid(pid, &wstatus, 0) == pid);
> +    if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != 0) {
> +        g_test_message("Failed to analyze the migration stream");
> +        g_test_fail();
> +    }
> +    test_migrate_end(from, to, false);
> +    cleanup("migfile");
> +}
> +#endif
> +
>  static void test_precopy_common(MigrateCommon *args)
>  {
>      QTestState *from, *to;
> @@ -2837,6 +2894,9 @@ int main(int argc, char **argv)
>      }
>  
>      qtest_add_func("/migration/bad_dest", test_baddest);
> +#ifndef _WIN32
> +    qtest_add_func("/migration/analyze-script", test_analyze_script);
> +#endif
>      qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
>      qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
>      /*
> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> index 66795cfcd2..d6022ebd64 100644
> --- a/tests/qtest/meson.build
> +++ b/tests/qtest/meson.build
> @@ -357,6 +357,8 @@ foreach dir : target_dirs
>      test_deps += [qsd]
>    endif
>  
> +  qtest_env.set('PYTHON', python.full_path())
> +
>    foreach test : target_qtests
>      # Executables are shared across targets, declare them only the first time we
>      # encounter them

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
  2024-05-21 12:24   ` Alex Bennée
@ 2024-05-21 12:46     ` Fabiano Rosas
  2024-05-22  5:36       ` Thomas Huth
  2024-05-22 14:11       ` Alex Bennée
  0 siblings, 2 replies; 49+ messages in thread
From: Fabiano Rosas @ 2024-05-21 12:46 UTC (permalink / raw)
  To: Alex Bennée, Juan Quintela
  Cc: qemu-devel, Eric Blake, Thomas Huth, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Laurent Vivier, Richard Henderson

Alex Bennée <alex.bennee@linaro.org> writes:

> Juan Quintela <quintela@redhat.com> writes:
>
>> From: Fabiano Rosas <farosas@suse.de>
>>
>> Add a smoke test that migrates to a file and gives it to the
>> script. It should catch the most annoying errors such as changes in
>> the ram flags.
>>
>> After code has been merged it becomes way harder to figure out what is
>> causing the script to fail, the person making the change is the most
>> likely to know right away what the problem is.
>>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> Acked-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Juan Quintela <quintela@redhat.com>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> Message-ID: <20231009184326.15777-7-farosas@suse.de>
>
> I bisected the failures I'm seeing on s390x to the introduction of this
> script. I don't know if it's simply a timeout on a relatively slow VM:

What's the range of your bisect? That test has been disabled and then
reenabled on s390x. It could be tripping the bisect.

04131e0009 ("tests/qtest/migration-test: Disable the analyze-migration.py test on s390x")
81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")

I don't think that test itself could be timing out. It's a very simple
test. It runs a migration and then uses the output to validate the
script.
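
Roughly, the manual equivalent of what the test does is (paths
illustrative; the stream is written to a file via the exec: URI and the
script is then run on it):

  # in the source QEMU's QMP monitor:
  #   migrate "exec:cat > /tmp/migfile"
  # then, from the build directory:
  ./pyvenv/bin/python3 scripts/analyze-migration.py -f /tmp/migfile > /dev/null
  echo $?   # the qtest marks a failure when this exit status is non-zero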

I don't have a Z machine at hand and the migration tests only run with
KVM for s390x, so it would be useful to take a look at meson's
testlog.txt so we can see which test is failing and hopefully in what
way it is failing.

If you're up for it, running this in a loop is usually the best way to
catch any intermittent issues:

QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test
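
For example, wrapped in a shell loop until it trips:

for i in $(seq 1 100); do
    QTEST_QEMU_BINARY=./qemu-system-s390x ./tests/qtest/migration-test || break
done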

And once you figure out which test, there's this monstrosity:

QTEST_QEMU_BINARY='gdb -q --ex "set pagination off"          \
                          --ex "set print thread-events off" \
                          --ex "handle SIGUSR1 noprint"      \
                          --ex "handle SIGPIPE noprint"      \
                          --ex "run" --ex "quit \$_exitcode" \
                          --args ./qemu-system-x86_64'       \
                          gdb -q --ex "set prompt (qtest) "  \
                          --ex "handle SIGPIPE noprint"      \
                          --args ./tests/qtest/migration-test -p /x86_64/migration/<some>/<test>

> Summary of Failures:
>
>  36/546 qemu:qtest+qtest-s390x / qtest-s390x/migration-test                       ERROR          93.51s   killed by signal 6 SIGABRT
>
> It seems to be unstable as we pass sometimes:
>
> 11:26:42 [ajb@qemu01:~/l/q/b/system] master|… + ./pyvenv/bin/meson test --repeat 100 qtest-s390x/migration-test
> ninja: Entering directory `/home/ajb/lsrc/qemu.git/builds/system'
> [1/9] Generating qemu-version.h with a custom command (wrapped by meson to capture output)
>   1/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          ERROR          251.98s   killed by signal 6 SIGABRT
>>>> MALLOC_PERTURB_=9 PYTHON=/home/ajb/lsrc/qemu.git/builds/system/pyvenv/bin/python3 G_TEST_DBUS_DAEMON=/home/ajb/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-s390x QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon /home/ajb/lsrc/qemu.git/builds/system/tests/qtest/migration-test --tap -k
>
>   2/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          ERROR          258.71s   killed by signal 6 SIGABRT
>>>> PYTHON=/home/ajb/lsrc/qemu.git/builds/system/pyvenv/bin/python3 MALLOC_PERTURB_=205 G_TEST_DBUS_DAEMON=/home/ajb/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-s390x QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon /home/ajb/lsrc/qemu.git/builds/system/tests/qtest/migration-test --tap -k
>
>   3/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             302.53s   46 subtests passed
>   4/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             319.56s   46 subtests passed
>   5/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             320.11s   46 subtests passed
>   6/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             328.40s   46 subtests passed
>
> Ok:                 4   
> Expected Fail:      0   
> Fail:               2   
> Unexpected Pass:    0   
> Skipped:            0   
> Timeout:            0   
>
>> ---
>>  tests/qtest/migration-test.c | 60 ++++++++++++++++++++++++++++++++++++
>>  tests/qtest/meson.build      |  2 ++
>>  2 files changed, 62 insertions(+)
>>
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index 8eb2053dbb..cef5081f8c 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -66,6 +66,8 @@ static bool got_dst_resume;
>>   */
>>  #define DIRTYLIMIT_TOLERANCE_RANGE  25  /* MB/s */
>>  
>> +#define ANALYZE_SCRIPT "scripts/analyze-migration.py"
>> +
>>  #if defined(__linux__)
>>  #include <sys/syscall.h>
>>  #include <sys/vfs.h>
>> @@ -1501,6 +1503,61 @@ static void test_baddest(void)
>>      test_migrate_end(from, to, false);
>>  }
>>  
>> +#ifndef _WIN32
>> +static void test_analyze_script(void)
>> +{
>> +    MigrateStart args = {
>> +        .opts_source = "-uuid 11111111-1111-1111-1111-111111111111",
>> +    };
>> +    QTestState *from, *to;
>> +    g_autofree char *uri = NULL;
>> +    g_autofree char *file = NULL;
>> +    int pid, wstatus;
>> +    const char *python = g_getenv("PYTHON");
>> +
>> +    if (!python) {
>> +        g_test_skip("PYTHON variable not set");
>> +        return;
>> +    }
>> +
>> +    /* dummy url */
>> +    if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", &args)) {
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * Setting these two capabilities causes the "configuration"
>> +     * vmstate to include subsections for them. The script needs to
>> +     * parse those subsections properly.
>> +     */
>> +    migrate_set_capability(from, "validate-uuid", true);
>> +    migrate_set_capability(from, "x-ignore-shared", true);
>> +
>> +    file = g_strdup_printf("%s/migfile", tmpfs);
>> +    uri = g_strdup_printf("exec:cat > %s", file);
>> +
>> +    migrate_ensure_converge(from);
>> +    migrate_qmp(from, uri, "{}");
>> +    wait_for_migration_complete(from);
>> +
>> +    pid = fork();
>> +    if (!pid) {
>> +        close(1);
>> +        open("/dev/null", O_WRONLY);
>> +        execl(python, python, ANALYZE_SCRIPT, "-f", file, NULL);
>> +        g_assert_not_reached();
>> +    }
>> +
>> +    g_assert(waitpid(pid, &wstatus, 0) == pid);
>> +    if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != 0) {
>> +        g_test_message("Failed to analyze the migration stream");
>> +        g_test_fail();
>> +    }
>> +    test_migrate_end(from, to, false);
>> +    cleanup("migfile");
>> +}
>> +#endif
>> +
>>  static void test_precopy_common(MigrateCommon *args)
>>  {
>>      QTestState *from, *to;
>> @@ -2837,6 +2894,9 @@ int main(int argc, char **argv)
>>      }
>>  
>>      qtest_add_func("/migration/bad_dest", test_baddest);
>> +#ifndef _WIN32
>> +    qtest_add_func("/migration/analyze-script", test_analyze_script);
>> +#endif
>>      qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
>>      qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
>>      /*
>> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
>> index 66795cfcd2..d6022ebd64 100644
>> --- a/tests/qtest/meson.build
>> +++ b/tests/qtest/meson.build
>> @@ -357,6 +357,8 @@ foreach dir : target_dirs
>>      test_deps += [qsd]
>>    endif
>>  
>> +  qtest_env.set('PYTHON', python.full_path())
>> +
>>    foreach test : target_qtests
>>      # Executables are shared across targets, declare them only the first time we
>>      # encounter them



* Re: [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
  2024-05-21 12:46     ` Fabiano Rosas
@ 2024-05-22  5:36       ` Thomas Huth
  2024-05-22 12:48         ` Fabiano Rosas
  2024-05-22 14:11       ` Alex Bennée
  1 sibling, 1 reply; 49+ messages in thread
From: Thomas Huth @ 2024-05-22  5:36 UTC (permalink / raw)
  To: Fabiano Rosas, Alex Bennée, Juan Quintela
  Cc: qemu-devel, Eric Blake, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Laurent Vivier, Richard Henderson

On 21/05/2024 14.46, Fabiano Rosas wrote:
> Alex Bennée <alex.bennee@linaro.org> writes:
> 
>> Juan Quintela <quintela@redhat.com> writes:
>>
>>> From: Fabiano Rosas <farosas@suse.de>
>>>
>>> Add a smoke test that migrates to a file and gives it to the
>>> script. It should catch the most annoying errors such as changes in
>>> the ram flags.
>>>
>>> After code has been merged it becomes way harder to figure out what is
>>> causing the script to fail, the person making the change is the most
>>> likely to know right away what the problem is.
>>>
>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>> Acked-by: Thomas Huth <thuth@redhat.com>
>>> Reviewed-by: Juan Quintela <quintela@redhat.com>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>> Message-ID: <20231009184326.15777-7-farosas@suse.de>
>>
>> I bisected the failures I'm seeing on s390x to the introduction of this
>> script. I don't know if it's simply a timeout on a relatively slow VM:
> 
> What's the range of your bisect? That test has been disabled and then
> reenabled on s390x. It could be tripping the bisect.
> 
> 04131e0009 ("tests/qtest/migration-test: Disable the analyze-migration.py test on s390x")
> 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
> 
> I don't think that test itself could be timing out. It's a very simple
> test. It runs a migration and then uses the output to validate the
> script.
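
For reference, a minimal by-hand version of what the test does; the QEMU
binary, guest options and migration file name below are illustrative, only
the "-f" flag comes from the test itself:

  ./qemu-system-s390x -nodefaults -display none -monitor stdio
  # at the (qemu) monitor prompt:
  migrate "exec:cat > /tmp/migfile"
  quit
  # back in the shell, from the source tree:
  python3 scripts/analyze-migration.py -f /tmp/migfile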

Agreed, the analyze-migration.py is unlikely to be the issue - especially 
since it seems to have been disabled again in commit 6f0771de903b ...
Fabiano, why did you disable it here again? The reason is not mentioned in 
the commit description.

But with regard to the failures, please note that we also had a bug
recently, introduced by commit 9d1b0f5bf515a0 and just fixed in commit
bebe9603fcb072dc ... maybe that affected your bisecting, too.
(It's really bad that this bug sneaked in without being noticed ... we
should maybe look into running at least some of the migration tests with TCG
on x86 hosts, too...)
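
A minimal sketch of how that window can be kept out of an unrelated bisect;
the two commit ids are the ones above, while the range syntax and the skip
approach are generic git, not something from this thread:

  # mark the commits where the unrelated bug is present as untestable, so
  # the bisect never has to give a verdict on them ('^' pulls in the first
  # bad commit itself)
  git bisect skip '9d1b0f5bf515a0^..bebe9603fcb072dc'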

  Thomas




* Re: [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
  2024-05-22  5:36       ` Thomas Huth
@ 2024-05-22 12:48         ` Fabiano Rosas
  2024-05-22 13:00           ` Thomas Huth
  0 siblings, 1 reply; 49+ messages in thread
From: Fabiano Rosas @ 2024-05-22 12:48 UTC (permalink / raw)
  To: Thomas Huth, Alex Bennée, Juan Quintela
  Cc: qemu-devel, Eric Blake, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Laurent Vivier, Richard Henderson

Thomas Huth <thuth@redhat.com> writes:

> On 21/05/2024 14.46, Fabiano Rosas wrote:
>> Alex Bennée <alex.bennee@linaro.org> writes:
>> 
>>> Juan Quintela <quintela@redhat.com> writes:
>>>
>>>> From: Fabiano Rosas <farosas@suse.de>
>>>>
>>>> Add a smoke test that migrates to a file and gives it to the
>>>> script. It should catch the most annoying errors such as changes in
>>>> the ram flags.
>>>>
>>>> After code has been merged it becomes way harder to figure out what is
>>>> causing the script to fail, the person making the change is the most
>>>> likely to know right away what the problem is.
>>>>
>>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>>> Acked-by: Thomas Huth <thuth@redhat.com>
>>>> Reviewed-by: Juan Quintela <quintela@redhat.com>
>>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>>> Message-ID: <20231009184326.15777-7-farosas@suse.de>
>>>
>>> I bisected the failures I'm seeing on s390x to the introduction of this
>>> script. I don't know if it's simply a timeout on a relatively slow VM:
>> 
>> What's the range of your bisect? That test has been disabled and then
>> reenabled on s390x. It could be tripping the bisect.
>> 
>> 04131e0009 ("tests/qtest/migration-test: Disable the analyze-migration.py test on s390x")
>> 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
>> 
>> I don't think that test itself could be timing out. It's a very simple
>> test. It runs a migration and then uses the output to validate the
>> script.
>
> Agreed, the analyze-migration.py is unlikely to be the issue - especially 
> since it seems to have been disabled again in commit 6f0771de903b ...
> Fabiano, why did you disable it here again? The reason is not mentioned in 
> the commit description.

Your patch 81c2c9dd5d was merged between my v1 and v2 on the list and I
didn't notice, so I messed up the rebase. I'll send a patch soon to fix
that.



* Re: [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
  2024-05-22 12:48         ` Fabiano Rosas
@ 2024-05-22 13:00           ` Thomas Huth
  0 siblings, 0 replies; 49+ messages in thread
From: Thomas Huth @ 2024-05-22 13:00 UTC (permalink / raw)
  To: Fabiano Rosas, Alex Bennée, Juan Quintela
  Cc: qemu-devel, Eric Blake, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Laurent Vivier, Richard Henderson

On 22/05/2024 14.48, Fabiano Rosas wrote:
> Thomas Huth <thuth@redhat.com> writes:
> 
>> On 21/05/2024 14.46, Fabiano Rosas wrote:
>>> Alex Bennée <alex.bennee@linaro.org> writes:
>>>
>>>> Juan Quintela <quintela@redhat.com> writes:
>>>>
>>>>> From: Fabiano Rosas <farosas@suse.de>
>>>>>
>>>>> Add a smoke test that migrates to a file and gives it to the
>>>>> script. It should catch the most annoying errors such as changes in
>>>>> the ram flags.
>>>>>
>>>>> After code has been merged it becomes way harder to figure out what is
>>>>> causing the script to fail, the person making the change is the most
>>>>> likely to know right away what the problem is.
>>>>>
>>>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>>>> Acked-by: Thomas Huth <thuth@redhat.com>
>>>>> Reviewed-by: Juan Quintela <quintela@redhat.com>
>>>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>>>> Message-ID: <20231009184326.15777-7-farosas@suse.de>
>>>>
>>>> I bisected the failures I'm seeing on s390x to the introduction of this
>>>> script. I don't know if it's simply a timeout on a relatively slow VM:
>>>
>>> What's the range of your bisect? That test has been disabled and then
>>> reenabled on s390x. It could be tripping the bisect.
>>>
>>> 04131e0009 ("tests/qtest/migration-test: Disable the analyze-migration.py test on s390x")
>>> 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
>>>
>>> I don't think that test itself could be timing out. It's a very simple
>>> test. It runs a migration and then uses the output to validate the
>>> script.
>>
>> Agreed, the analyze-migration.py is unlikely to be the issue - especially
>> since it seems to have been disabled again in commit 6f0771de903b ...
>> Fabiano, why did you disable it here again? The reason is not mentioned in
>> the commit description.
> 
> Your patch 81c2c9dd5d was merged between my v1 and v2 on the list and I
> didn't notice, so I messed up the rebase. I'll send a patch soon to fix
> that.

Thanks, but I already sent a patch earlier today that should fix the issue:

https://lore.kernel.org/qemu-devel/20240522091255.417263-1-thuth@redhat.com/T/#u

  Thomas




* Re: [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
  2024-05-21 12:46     ` Fabiano Rosas
  2024-05-22  5:36       ` Thomas Huth
@ 2024-05-22 14:11       ` Alex Bennée
  1 sibling, 0 replies; 49+ messages in thread
From: Alex Bennée @ 2024-05-22 14:11 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Juan Quintela, qemu-devel, Eric Blake, Thomas Huth, Leonardo Bras,
	Vladimir Sementsov-Ogievskiy, Markus Armbruster, Peter Xu,
	Paolo Bonzini, Fam Zheng, Stefan Hajnoczi, Li Zhijian, John Snow,
	qemu-block, Cleber Rosa, Laurent Vivier, Richard Henderson

Fabiano Rosas <farosas@suse.de> writes:

> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Juan Quintela <quintela@redhat.com> writes:
>>
>>> From: Fabiano Rosas <farosas@suse.de>
>>>
>>> Add a smoke test that migrates to a file and gives it to the
>>> script. It should catch the most annoying errors such as changes in
>>> the ram flags.
>>>
>>> After code has been merged it becomes way harder to figure out what is
>>> causing the script to fail, the person making the change is the most
>>> likely to know right away what the problem is.
>>>
>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>> Acked-by: Thomas Huth <thuth@redhat.com>
>>> Reviewed-by: Juan Quintela <quintela@redhat.com>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>> Message-ID: <20231009184326.15777-7-farosas@suse.de>
>>
>> I bisected the failures I'm seeing on s390x to the introduction of this
>> script. I don't know if it's simply a timeout on a relatively slow VM:
>
> What's the range of your bisect? That test has been disabled and then
> reenabled on s390x. It could be tripping the bisect.
>
> 04131e0009 ("tests/qtest/migration-test: Disable the analyze-migration.py test on s390x")
> 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")

I bisected between v8.1.0 and master.

But it is still failing at HEAD 01782d6b294f95bcde334386f0aaac593cd28c0d,
as you can see from the run I did at the end.
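
For reference, a minimal sketch of driving such a bisect unattended; the
build directory, the ninja targets and the exit-code handling below are
assumptions about a typical out-of-tree Meson build, not something taken
from the runs further down:

  git bisect start
  git bisect bad master
  git bisect good v8.1.0
  git bisect run sh -c '
      ninja -C builds/system qemu-system-s390x tests/qtest/migration-test \
          || exit 125   # 125 tells bisect to skip commits that do not build
      cd builds/system || exit 125
      # "|| exit 1" folds any failure (including SIGABRT, status 134) into
      # the 1..127 range that "git bisect run" treats as "bad"
      QTEST_QEMU_BINARY=./qemu-system-s390x ./tests/qtest/migration-test || exit 1
  '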

>
> I don't think that test itself could be timing out. It's a very simple
> test. It runs a migration and then uses the output to validate the
> script.
>
> I don't have a Z machine at hand and the migration tests only run with
> KVM for s390x, so it would be useful to take a look at meson's
> testlog.txt so we can see which test is failing and hopefully in what
> way it is failing.
>
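
One way to pull the verdicts out of that log, assuming the Meson build tree
used in the runs further down; the grep pattern is only an illustration:

  grep -E 'not ok|ERROR|SIGABRT' builds/system/meson-logs/testlog.txt
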
> If you're up for it, running this in a loop is usually the best way to
> catch any intermittent issues:
>
> QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test
>
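
A minimal sketch of that loop, substituting the s390x binary for this case;
the iteration count and log file names are arbitrary:

  for i in $(seq 1 100); do
      QTEST_QEMU_BINARY=./qemu-system-s390x ./tests/qtest/migration-test \
          > migration-test-$i.log 2>&1 || { echo "failed on run $i"; break; }
  done
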
> And once you figure out which test, there's this monstrosity:
>
> QTEST_QEMU_BINARY='gdb -q --ex "set pagination off"          \
>                           --ex "set print thread-events off" \
>                           --ex "handle SIGUSR1 noprint"      \
>                           --ex "handle SIGPIPE noprint"      \
>                           --ex "run" --ex "quit \$_exitcode" \
>                           --args ./qemu-system-x86_64'       \
>                           gdb -q --ex "set prompt (qtest) "  \
>                           --ex "handle SIGPIPE noprint"      \
>                           --args ./tests/qtest/migration-test -p /x86_64/migration/<some>/<test>
>
>> Summary of Failures:
>>
>>  36/546 qemu:qtest+qtest-s390x / qtest-s390x/migration-test                       ERROR          93.51s   killed by signal 6 SIGABRT
>>
>> It seems to be unstable as we pass sometimes:
>>
>> 11:26:42 [ajb@qemu01:~/l/q/b/system] master|… + ./pyvenv/bin/meson test --repeat 100 qtest-s390x/migration-test
>> ninja: Entering directory `/home/ajb/lsrc/qemu.git/builds/system'
>> [1/9] Generating qemu-version.h with a custom command (wrapped by meson to capture output)
>>   1/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          ERROR          251.98s   killed by signal 6 SIGABRT
>>>>> MALLOC_PERTURB_=9
>>>>> PYTHON=/home/ajb/lsrc/qemu.git/builds/system/pyvenv/bin/python3
>>>>> G_TEST_DBUS_DAEMON=/home/ajb/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh
>>>>> QTEST_QEMU_BINARY=./qemu-system-s390x QTEST_QEMU_IMG=./qemu-img
>>>>> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
>>>>> /home/ajb/lsrc/qemu.git/builds/system/tests/qtest/migration-test
>>>>> --tap -k
>>
>>   2/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          ERROR          258.71s   killed by signal 6 SIGABRT
>>>>> PYTHON=/home/ajb/lsrc/qemu.git/builds/system/pyvenv/bin/python3
>>>>> MALLOC_PERTURB_=205
>>>>> G_TEST_DBUS_DAEMON=/home/ajb/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh
>>>>> QTEST_QEMU_BINARY=./qemu-system-s390x QTEST_QEMU_IMG=./qemu-img
>>>>> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
>>>>> /home/ajb/lsrc/qemu.git/builds/system/tests/qtest/migration-test
>>>>> --tap -k
>>
>>   3/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             302.53s   46 subtests passed
>>   4/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             319.56s   46 subtests passed
>>   5/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             320.11s   46 subtests passed
>>   6/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test          OK             328.40s   46 subtests passed
>>
>> Ok:                 4   
>> Expected Fail:      0   
>> Fail:               2   
>> Unexpected Pass:    0   
>> Skipped:            0   
>> Timeout:            0   
>>
>>> ---
>>>  tests/qtest/migration-test.c | 60 ++++++++++++++++++++++++++++++++++++
>>>  tests/qtest/meson.build      |  2 ++
>>>  2 files changed, 62 insertions(+)
>>>
>>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>>> index 8eb2053dbb..cef5081f8c 100644
>>> --- a/tests/qtest/migration-test.c
>>> +++ b/tests/qtest/migration-test.c
>>> @@ -66,6 +66,8 @@ static bool got_dst_resume;
>>>   */
>>>  #define DIRTYLIMIT_TOLERANCE_RANGE  25  /* MB/s */
>>>  
>>> +#define ANALYZE_SCRIPT "scripts/analyze-migration.py"
>>> +
>>>  #if defined(__linux__)
>>>  #include <sys/syscall.h>
>>>  #include <sys/vfs.h>
>>> @@ -1501,6 +1503,61 @@ static void test_baddest(void)
>>>      test_migrate_end(from, to, false);
>>>  }
>>>  
>>> +#ifndef _WIN32
>>> +static void test_analyze_script(void)
>>> +{
>>> +    MigrateStart args = {
>>> +        .opts_source = "-uuid 11111111-1111-1111-1111-111111111111",
>>> +    };
>>> +    QTestState *from, *to;
>>> +    g_autofree char *uri = NULL;
>>> +    g_autofree char *file = NULL;
>>> +    int pid, wstatus;
>>> +    const char *python = g_getenv("PYTHON");
>>> +
>>> +    if (!python) {
>>> +        g_test_skip("PYTHON variable not set");
>>> +        return;
>>> +    }
>>> +
>>> +    /* dummy url */
>>> +    if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", &args)) {
>>> +        return;
>>> +    }
>>> +
>>> +    /*
>>> +     * Setting these two capabilities causes the "configuration"
>>> +     * vmstate to include subsections for them. The script needs to
>>> +     * parse those subsections properly.
>>> +     */
>>> +    migrate_set_capability(from, "validate-uuid", true);
>>> +    migrate_set_capability(from, "x-ignore-shared", true);
>>> +
>>> +    file = g_strdup_printf("%s/migfile", tmpfs);
>>> +    uri = g_strdup_printf("exec:cat > %s", file);
>>> +
>>> +    migrate_ensure_converge(from);
>>> +    migrate_qmp(from, uri, "{}");
>>> +    wait_for_migration_complete(from);
>>> +
>>> +    pid = fork();
>>> +    if (!pid) {
>>> +        close(1);
>>> +        open("/dev/null", O_WRONLY);
>>> +        execl(python, python, ANALYZE_SCRIPT, "-f", file, NULL);
>>> +        g_assert_not_reached();
>>> +    }
>>> +
>>> +    g_assert(waitpid(pid, &wstatus, 0) == pid);
>>> +    if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != 0) {
>>> +        g_test_message("Failed to analyze the migration stream");
>>> +        g_test_fail();
>>> +    }
>>> +    test_migrate_end(from, to, false);
>>> +    cleanup("migfile");
>>> +}
>>> +#endif
>>> +
>>>  static void test_precopy_common(MigrateCommon *args)
>>>  {
>>>      QTestState *from, *to;
>>> @@ -2837,6 +2894,9 @@ int main(int argc, char **argv)
>>>      }
>>>  
>>>      qtest_add_func("/migration/bad_dest", test_baddest);
>>> +#ifndef _WIN32
>>> +    qtest_add_func("/migration/analyze-script", test_analyze_script);
>>> +#endif
>>>      qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
>>>      qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
>>>      /*
>>> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
>>> index 66795cfcd2..d6022ebd64 100644
>>> --- a/tests/qtest/meson.build
>>> +++ b/tests/qtest/meson.build
>>> @@ -357,6 +357,8 @@ foreach dir : target_dirs
>>>      test_deps += [qsd]
>>>    endif
>>>  
>>> +  qtest_env.set('PYTHON', python.full_path())
>>> +
>>>    foreach test : target_qtests
>>>      # Executables are shared across targets, declare them only the first time we
>>>      # encounter them

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



end of thread (newest message: 2024-05-22 14:12 UTC)

Thread overview: 49+ messages
2023-10-17  8:29 [PULL 00/38] Migration 20231017 patches Juan Quintela
2023-10-17  8:29 ` [PULL 01/38] migration: refactor migration_completion Juan Quintela
2023-10-17  8:29 ` [PULL 02/38] migration: Use g_autofree to simplify ram_dirty_bitmap_reload() Juan Quintela
2023-10-17  8:29 ` [PULL 03/38] migration: Allow user to specify available switchover bandwidth Juan Quintela
2023-10-17  8:29 ` [PULL 04/38] migration: fix RAMBlock add NULL check Juan Quintela
2023-10-17  8:29 ` [PULL 05/38] migration: Add the configuration vmstate to the json writer Juan Quintela
2023-10-17  8:29 ` [PULL 06/38] migration: Fix analyze-migration.py 'configuration' parsing Juan Quintela
2023-10-17  8:29 ` [PULL 07/38] migration: Add capability parsing to analyze-migration.py Juan Quintela
2023-10-17  8:29 ` [PULL 08/38] migration: Fix analyze-migration.py when ignore-shared is used Juan Quintela
2023-10-17  8:29 ` [PULL 09/38] migration: Fix analyze-migration read operation signedness Juan Quintela
2023-10-17  8:29 ` [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script Juan Quintela
2024-05-21 12:24   ` Alex Bennée
2024-05-21 12:46     ` Fabiano Rosas
2024-05-22  5:36       ` Thomas Huth
2024-05-22 12:48         ` Fabiano Rosas
2024-05-22 13:00           ` Thomas Huth
2024-05-22 14:11       ` Alex Bennée
2023-10-17  8:29 ` [PULL 11/38] tests/qtest: migration-test: Add tests for file-based migration Juan Quintela
2023-10-17  8:29 ` [PULL 12/38] migration: hold the BQL during setup Juan Quintela
2023-10-17  8:29 ` [PULL 13/38] migration: Non multifd migration don't care about multifd flushes Juan Quintela
2023-10-19 11:47   ` Michael Tokarev
2023-10-19 12:03     ` Juan Quintela
2023-10-17  8:29 ` [PULL 14/38] migration: Create migrate_rdma() Juan Quintela
2023-10-17  8:29 ` [PULL 15/38] migration/rdma: Unfold ram_control_before_iterate() Juan Quintela
2023-10-17  8:29 ` [PULL 16/38] migration/rdma: Unfold ram_control_after_iterate() Juan Quintela
2023-10-17  8:29 ` [PULL 17/38] migration/rdma: Remove all uses of RAM_CONTROL_HOOK Juan Quintela
2023-10-17  8:29 ` [PULL 18/38] migration/rdma: Unfold hook_ram_load() Juan Quintela
2023-10-17  8:29 ` [PULL 19/38] migration/rdma: Create rdma_control_save_page() Juan Quintela
2023-10-17  8:29 ` [PULL 20/38] qemu-file: Remove QEMUFileHooks Juan Quintela
2023-10-17  8:29 ` [PULL 21/38] migration/rdma: Move rdma constants from qemu-file.h to rdma.h Juan Quintela
2023-10-17  8:29 ` [PULL 22/38] migration/rdma: Remove qemu_ prefix from exported functions Juan Quintela
2023-10-17  8:29 ` [PULL 23/38] migration/rdma: Check sooner if we are in postcopy for save_page() Juan Quintela
2023-10-17  8:29 ` [PULL 24/38] migration/rdma: Use i as for index instead of idx Juan Quintela
2023-10-17  8:29 ` [PULL 25/38] migration/rdma: Declare for index variables local Juan Quintela
2023-10-17  8:29 ` [PULL 26/38] migration/rdma: Remove all "ret" variables that are used only once Juan Quintela
2023-10-17  8:29 ` [PULL 27/38] migration: Improve json and formatting Juan Quintela
2023-10-17  8:29 ` [PULL 28/38] migration: check for rate_limit_max for RATE_LIMIT_DISABLED Juan Quintela
2023-10-17  8:29 ` [PULL 29/38] multifd: fix counters in multifd_send_thread Juan Quintela
2023-10-17  8:29 ` [PULL 30/38] multifd: reset next_packet_len after sending pages Juan Quintela
2023-10-17  8:29 ` [PULL 31/38] migration/ram: Refactor precopy ram loading code Juan Quintela
2023-10-17  8:29 ` [PULL 32/38] migration/ram: Remove RAMState from xbzrle_cache_zero_page Juan Quintela
2023-10-17  8:29 ` [PULL 33/38] migration/ram: Stop passing QEMUFile around in save_zero_page Juan Quintela
2023-10-17  8:29 ` [PULL 34/38] migration/ram: Move xbzrle zero page handling into save_zero_page Juan Quintela
2023-10-17  8:30 ` [PULL 35/38] migration/ram: Merge save_zero_page functions Juan Quintela
2023-10-17  8:30 ` [PULL 36/38] migration/multifd: Remove direct "socket" references Juan Quintela
2023-10-17  8:30 ` [PULL 37/38] migration/multifd: Unify multifd_send_thread error paths Juan Quintela
2023-10-17  8:30 ` [PULL 38/38] migration/multifd: Clarify Error usage in multifd_channel_connect Juan Quintela
2023-10-17 19:04 ` [PULL 00/38] Migration 20231017 patches Stefan Hajnoczi
  -- strict thread matches above, loose matches on Subject: below --
2023-10-16 10:06 [PULL 00/38] Migration 20231016 patches Juan Quintela
2023-10-16 10:06 ` [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script Juan Quintela
