qemu-devel.nongnu.org archive mirror
* [PATCH 00/26] Migration PULL 2023-07-24
@ 2023-07-24 13:06 Juan Quintela
  2023-07-24 13:06 ` [PATCH 01/26] migration/multifd: Rename threadinfo.c functions Juan Quintela
                   ` (26 more replies)
  0 siblings, 27 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

Hi

This is the migration PULL request.  It has:
- Fabiano Rosas threadinfo cleanups
- Hyman Huang dirtylimit changes
- Part of my changes
- Peter Xu documentation
- Tejus update to migration descriptions
- Wei Wang improvements for postcopy and multifd setup

Please apply.

Now a note on CI, which has been really bad.  After too many problems
with the last PULLs, I decided to learn to use the QEMU CI.  On one
hand, it is not so difficult, even I can use it O:-)

On the other hand, the number of problems that I got is immense.  Some
of them disappear when I rerun the checks, but I never know if it is
my PULL request, the CI system or the tests themselves.

So it ends up going something like:

while (true); do
- git pull
- git rebase
- git push ci blah, blah
- Next day comes, and too many errors, so I rebase again

The last step takes more time than expected, and it is not always
trivial to know what the failure is.

This (last) patch is not part of the PULL request, but I have found
that it _always_ makes gcov fail.  I had to use bisect to find where
the problem was.

https://gitlab.com/juan.quintela/qemu/-/jobs/4571878922

If anyone can explain how a change in tests/qtest/migration-test.c
can break block layer tests, I am all ears.

Yes, I tried several times.  It always fails on that patch.  The
rest passes with flying colors.

Later, Juan.

Fabiano Rosas (2):
  migration/multifd: Rename threadinfo.c functions
  migration/multifd: Protect accesses to migration_threads

Hyman Huang(黄勇) (8):
  softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
  qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
  qapi/migration: Introduce vcpu-dirty-limit parameters
  migration: Introduce dirty-limit capability
  migration: Refactor auto-converge capability logic
  migration: Put the detection logic before auto-converge checking
  migration: Implement dirty-limit convergence algo
  migration: Extend query-migrate to provide dirty page limit info

Juan Quintela (12):
  migration-test: Be consistent for ppc
  migration-test: Make machine_opts regular with other options
  migration-test: Create arch_opts
  migration-test: machine_opts is really arch specific
  migration.json: Don't use space before colon
  migration: skipped field is really obsolete.
  qemu-file: Rename qemu_file_transferred_ fast -> noflush
  migration: Change qemu_file_transferred to noflush
  qemu_file: Make qemu_file_is_writable() static
  qemu-file: Simplify qemu_file_shutdown()
  qemu-file: Make qemu_file_get_error_obj() static
  migration/rdma: Split qemu_fopen_rdma() into input/output functions

Peter Xu (1):
  docs/migration: Update postcopy bits

Tejus GK (1):
  migration: Update error description whenever migration fails

Wei Wang (2):
  migration: enforce multifd and postcopy preempt to be set before
    incoming
  qtest/migration-tests.c: use "-incoming defer" for postcopy tests

 docs/about/deprecated.rst      |  10 +++
 docs/devel/migration.rst       |  94 ++++++++++++++++++++---------
 qapi/migration.json            | 107 ++++++++++++++++++++++++++-------
 include/sysemu/dirtylimit.h    |   2 +
 migration/options.h            |   1 +
 migration/qemu-file.h          |  14 ++---
 migration/threadinfo.h         |   7 +--
 migration/migration-hmp-cmds.c |  26 ++++++++
 migration/migration.c          |  36 ++++++++---
 migration/multifd.c            |   4 +-
 migration/options.c            |  87 ++++++++++++++++++++++++++-
 migration/qemu-file.c          |  24 ++------
 migration/ram.c                |  59 +++++++++++++++---
 migration/rdma.c               |  39 ++++++------
 migration/savevm.c             |   6 +-
 migration/threadinfo.c         |  19 +++++-
 migration/vmstate.c            |   4 +-
 softmmu/dirtylimit.c           |  97 +++++++++++++++++++++++++++---
 tests/qtest/migration-test.c   |  48 +++++++--------
 migration/trace-events         |   1 +
 20 files changed, 520 insertions(+), 165 deletions(-)

-- 
2.40.1




* [PATCH 01/26] migration/multifd: Rename threadinfo.c functions
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 02/26] migration/multifd: Protect accesses to migration_threads Juan Quintela
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Fabiano Rosas, Philippe Mathieu-Daudé

From: Fabiano Rosas <farosas@suse.de>

We're about to add more functions to this file so make it use the same
coding style as the rest of the code.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230607161306.31425-2-farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/threadinfo.h | 5 ++---
 migration/migration.c  | 4 ++--
 migration/multifd.c    | 4 ++--
 migration/threadinfo.c | 4 ++--
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/migration/threadinfo.h b/migration/threadinfo.h
index 4d69423c0a..8aa6999d58 100644
--- a/migration/threadinfo.h
+++ b/migration/threadinfo.h
@@ -23,6 +23,5 @@ struct MigrationThread {
     QLIST_ENTRY(MigrationThread) node;
 };
 
-MigrationThread *MigrationThreadAdd(const char *name, int thread_id);
-
-void MigrationThreadDel(MigrationThread *info);
+MigrationThread *migration_threads_add(const char *name, int thread_id);
+void migration_threads_remove(MigrationThread *info);
diff --git a/migration/migration.c b/migration/migration.c
index 91bba630a8..ae49d42eab 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2953,7 +2953,7 @@ static void *migration_thread(void *opaque)
     MigThrError thr_error;
     bool urgent = false;
 
-    thread = MigrationThreadAdd("live_migration", qemu_get_thread_id());
+    thread = migration_threads_add("live_migration", qemu_get_thread_id());
 
     rcu_register_thread();
 
@@ -3031,7 +3031,7 @@ static void *migration_thread(void *opaque)
     migration_iteration_finish(s);
     object_unref(OBJECT(s));
     rcu_unregister_thread();
-    MigrationThreadDel(thread);
+    migration_threads_remove(thread);
     return NULL;
 }
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 3387d8277f..4c6cee6547 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -651,7 +651,7 @@ static void *multifd_send_thread(void *opaque)
     int ret = 0;
     bool use_zero_copy_send = migrate_zero_copy_send();
 
-    thread = MigrationThreadAdd(p->name, qemu_get_thread_id());
+    thread = migration_threads_add(p->name, qemu_get_thread_id());
 
     trace_multifd_send_thread_start(p->id);
     rcu_register_thread();
@@ -767,7 +767,7 @@ out:
     qemu_mutex_unlock(&p->mutex);
 
     rcu_unregister_thread();
-    MigrationThreadDel(thread);
+    migration_threads_remove(thread);
     trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages);
 
     return NULL;
diff --git a/migration/threadinfo.c b/migration/threadinfo.c
index 1de8b31855..3dd9b14ae6 100644
--- a/migration/threadinfo.c
+++ b/migration/threadinfo.c
@@ -14,7 +14,7 @@
 
 static QLIST_HEAD(, MigrationThread) migration_threads;
 
-MigrationThread *MigrationThreadAdd(const char *name, int thread_id)
+MigrationThread *migration_threads_add(const char *name, int thread_id)
 {
     MigrationThread *thread =  g_new0(MigrationThread, 1);
     thread->name = name;
@@ -25,7 +25,7 @@ MigrationThread *MigrationThreadAdd(const char *name, int thread_id)
     return thread;
 }
 
-void MigrationThreadDel(MigrationThread *thread)
+void migration_threads_remove(MigrationThread *thread)
 {
     if (thread) {
         QLIST_REMOVE(thread, node);
-- 
2.40.1




* [PATCH 02/26] migration/multifd: Protect accesses to migration_threads
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
  2023-07-24 13:06 ` [PATCH 01/26] migration/multifd: Rename threadinfo.c functions Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:29   ` Fabiano Rosas
  2023-07-24 13:06 ` [PATCH 03/26] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" Juan Quintela
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Fabiano Rosas

From: Fabiano Rosas <farosas@suse.de>

This doubly linked list is common for all the multifd and migration
threads so we need to avoid concurrent access.

Add a mutex to protect the data from concurrent access. This fixes a
crash when removing two MigrationThread objects from the list at the
same time during cleanup of multifd threads.

Fixes: 671326201d ("migration: Introduce interface query-migrationthreads")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230607161306.31425-3-farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
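For readers without the QEMU tree at hand, the idea in the commit
message above can be sketched outside QEMU as follows.  This is only an
illustration under stated assumptions: it uses plain pthreads and a
hand-rolled doubly linked list instead of QemuMutex, the lock guard
macros and QLIST, and every name in it (ThreadInfo, thread_list_add,
thread_list_remove, thread_list_count, run_workers) is hypothetical.
It also uses a static mutex initializer where the patch uses a
constructor function.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for MigrationThread; not QEMU's definition. */
typedef struct ThreadInfo {
    const char *name;
    int thread_id;
    struct ThreadInfo *prev;
    struct ThreadInfo *next;
} ThreadInfo;

/* One list shared by all threads, and the one lock that guards it. */
static ThreadInfo *list_head;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

ThreadInfo *thread_list_add(const char *name, int thread_id)
{
    ThreadInfo *t = calloc(1, sizeof(*t));

    assert(t != NULL);
    t->name = name;
    t->thread_id = thread_id;

    /* Only the list manipulation needs to be inside the lock. */
    pthread_mutex_lock(&list_lock);
    t->next = list_head;
    if (list_head) {
        list_head->prev = t;
    }
    list_head = t;
    pthread_mutex_unlock(&list_lock);

    return t;
}

void thread_list_remove(ThreadInfo *t)
{
    if (!t) {
        return;
    }
    /* Unlinking rewrites the neighbours' pointers, so it must be locked. */
    pthread_mutex_lock(&list_lock);
    if (t->prev) {
        t->prev->next = t->next;
    } else {
        list_head = t->next;
    }
    if (t->next) {
        t->next->prev = t->prev;
    }
    pthread_mutex_unlock(&list_lock);
    free(t);
}

int thread_list_count(void)
{
    int n = 0;

    /* Readers take the same lock so they never see a half-updated list. */
    pthread_mutex_lock(&list_lock);
    for (ThreadInfo *t = list_head; t; t = t->next) {
        n++;
    }
    pthread_mutex_unlock(&list_lock);
    return n;
}

/* Each worker registers and unregisters itself many times. */
static void *worker(void *opaque)
{
    (void)opaque;
    for (int i = 0; i < 10000; i++) {
        thread_list_remove(thread_list_add("worker", i));
    }
    return NULL;
}

/* Run up to 16 workers concurrently; returns the entries left behind. */
int run_workers(int nthreads)
{
    pthread_t tids[16];
    int started = 0;

    assert(nthreads > 0 && nthreads <= 16);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, worker, NULL) == 0) {
            started++;
        }
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return thread_list_count();
}
```

Calling run_workers(4) should leave the list empty.  Without the lock
in thread_list_remove(), two concurrent removals can corrupt the
neighbour pointers, which is the kind of crash the commit message
describes.
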
 migration/threadinfo.h |  2 --
 migration/threadinfo.c | 15 ++++++++++++++-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/migration/threadinfo.h b/migration/threadinfo.h
index 8aa6999d58..2f356ff312 100644
--- a/migration/threadinfo.h
+++ b/migration/threadinfo.h
@@ -10,8 +10,6 @@
  *  See the COPYING file in the top-level directory.
  */
 
-#include "qemu/queue.h"
-#include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-migration.h"
 
diff --git a/migration/threadinfo.c b/migration/threadinfo.c
index 3dd9b14ae6..262990dd75 100644
--- a/migration/threadinfo.c
+++ b/migration/threadinfo.c
@@ -10,23 +10,35 @@
  *  See the COPYING file in the top-level directory.
  */
 
+#include "qemu/osdep.h"
+#include "qemu/queue.h"
+#include "qemu/lockable.h"
 #include "threadinfo.h"
 
+QemuMutex migration_threads_lock;
 static QLIST_HEAD(, MigrationThread) migration_threads;
 
+static void __attribute__((constructor)) migration_threads_init(void)
+{
+    qemu_mutex_init(&migration_threads_lock);
+}
+
 MigrationThread *migration_threads_add(const char *name, int thread_id)
 {
     MigrationThread *thread =  g_new0(MigrationThread, 1);
     thread->name = name;
     thread->thread_id = thread_id;
 
-    QLIST_INSERT_HEAD(&migration_threads, thread, node);
+    WITH_QEMU_LOCK_GUARD(&migration_threads_lock) {
+        QLIST_INSERT_HEAD(&migration_threads, thread, node);
+    }
 
     return thread;
 }
 
 void migration_threads_remove(MigrationThread *thread)
 {
+    QEMU_LOCK_GUARD(&migration_threads_lock);
     if (thread) {
         QLIST_REMOVE(thread, node);
         g_free(thread);
@@ -39,6 +51,7 @@ MigrationThreadInfoList *qmp_query_migrationthreads(Error **errp)
     MigrationThreadInfoList **tail = &head;
     MigrationThread *thread = NULL;
 
+    QEMU_LOCK_GUARD(&migration_threads_lock);
     QLIST_FOREACH(thread, &migration_threads, node) {
         MigrationThreadInfo *info = g_new0(MigrationThreadInfo, 1);
         info->name = g_strdup(thread->name);
-- 
2.40.1




* [PATCH 03/26] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
  2023-07-24 13:06 ` [PATCH 01/26] migration/multifd: Rename threadinfo.c functions Juan Quintela
  2023-07-24 13:06 ` [PATCH 02/26] migration/multifd: Protect accesses to migration_threads Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 04/26] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter Juan Quintela
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <yong.huang@smartx.com>

The dirty_rate parameter of the hmp command "set_vcpu_dirty_limit" is
invalid if less than 0, so add a parameter check for it.

Note that this patch also deletes the unsolicited help message and
cleans up the code.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-1@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
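The control flow this patch moves to (validate first, then fall
through to a single error-reporting exit) can be sketched in
standalone C as below.  This is a rough illustration, not QEMU code:
Err, handle_error, apply_limit, set_dirty_limit and last_applied_limit
are hypothetical names, and the sketch assumes a reporting helper that
does nothing when no error was recorded, which is how the patch uses
hmp_handle_error() on the success path.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical error slot: an empty message means "no error". */
typedef struct {
    char msg[128];
} Err;

static int64_t applied_limit = -1;   /* -1: nothing applied yet */

/* Reports the error if one is set; a no-op otherwise. */
static void handle_error(const Err *err)
{
    if (err->msg[0] != '\0') {
        fprintf(stderr, "Error: %s\n", err->msg);
    }
}

/* Stand-in for the real work; this sketch assumes it cannot fail. */
static void apply_limit(int64_t dirty_rate, Err *err)
{
    (void)err;
    applied_limit = dirty_rate;
}

/* Returns 0 on success and -1 when the value was rejected. */
int set_dirty_limit(int64_t dirty_rate)
{
    Err err = { "" };

    assert(sizeof(err.msg) > 64);   /* room for the message below */
    if (dirty_rate < 0) {
        snprintf(err.msg, sizeof(err.msg),
                 "invalid dirty page limit %" PRId64, dirty_rate);
        goto out;
    }

    apply_limit(dirty_rate, &err);

out:
    /* Single exit: both the success and the failure path end here. */
    handle_error(&err);
    return err.msg[0] == '\0' ? 0 : -1;
}

int64_t last_applied_limit(void)
{
    return applied_limit;
}
```

With this shape, a rejected value never reaches apply_limit(), and
there is no separate early-return branch to keep in sync with the
error reporting.
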
 softmmu/dirtylimit.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c
index 015a9038d1..e80201097a 100644
--- a/softmmu/dirtylimit.c
+++ b/softmmu/dirtylimit.c
@@ -515,14 +515,15 @@ void hmp_set_vcpu_dirty_limit(Monitor *mon, const QDict *qdict)
     int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1);
     Error *err = NULL;
 
+    if (dirty_rate < 0) {
+        error_setg(&err, "invalid dirty page limit %" PRId64, dirty_rate);
+        goto out;
+    }
+
     qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err);
-    if (err) {
-        hmp_handle_error(mon, err);
-        return;
-    }
 
-    monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query "
-                   "dirty limit for virtual CPU]\n");
+out:
+    hmp_handle_error(mon, err);
 }
 
 static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index)
-- 
2.40.1




* [PATCH 04/26] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (2 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 03/26] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 05/26] qapi/migration: Introduce vcpu-dirty-limit parameters Juan Quintela
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <yong.huang@smartx.com>

Introduce the experimental migration parameter
"x-vcpu-dirty-limit-period", which is in the range of 1 to 1000 ms
and makes the dirty rate calculation period configurable.

Currently, the total time of live migration changes as
"x-vcpu-dirty-limit-period" varies; test results show the
optimal value of "x-vcpu-dirty-limit-period" ranges from
500 ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made
stable once it is proven that the best value cannot be determined
by developer experiments alone.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 qapi/migration.json            | 34 +++++++++++++++++++++++++++-------
 migration/migration-hmp-cmds.c |  8 ++++++++
 migration/options.c            | 28 ++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 47dfef0278..384b768e03 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -789,9 +789,14 @@
 #     Nodes are mapped to their block device name if there is one, and
 #     to their node name otherwise.  (Since 5.2)
 #
+# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during
+#                             live migration. Should be in the range 1 to 1000ms,
+#                             defaults to 1000ms. (Since 8.1)
+#
 # Features:
 #
-# @unstable: Member @x-checkpoint-delay is experimental.
+# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
+#            are experimental.
 #
 # Since: 2.4
 ##
@@ -809,8 +814,9 @@
            'multifd-channels',
            'xbzrle-cache-size', 'max-postcopy-bandwidth',
            'max-cpu-throttle', 'multifd-compression',
-           'multifd-zlib-level' ,'multifd-zstd-level',
-           'block-bitmap-mapping' ] }
+           'multifd-zlib-level', 'multifd-zstd-level',
+           'block-bitmap-mapping',
+           { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] } ] }
 
 ##
 # @MigrateSetParameters:
@@ -945,9 +951,14 @@
 #     Nodes are mapped to their block device name if there is one, and
 #     to their node name otherwise.  (Since 5.2)
 #
+# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during
+#                             live migration. Should be in the range 1 to 1000ms,
+#                             defaults to 1000ms. (Since 8.1)
+#
 # Features:
 #
-# @unstable: Member @x-checkpoint-delay is experimental.
+# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
+#            are experimental.
 #
 # TODO: either fuse back into MigrationParameters, or make
 #     MigrationParameters members mandatory
@@ -982,7 +993,9 @@
             '*multifd-compression': 'MultiFDCompression',
             '*multifd-zlib-level': 'uint8',
             '*multifd-zstd-level': 'uint8',
-            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } }
+            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
+            '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
+                                            'features': [ 'unstable' ] } } }
 
 ##
 # @migrate-set-parameters:
@@ -1137,9 +1150,14 @@
 #     Nodes are mapped to their block device name if there is one, and
 #     to their node name otherwise.  (Since 5.2)
 #
+# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during
+#                             live migration. Should be in the range 1 to 1000ms,
+#                             defaults to 1000ms. (Since 8.1)
+#
 # Features:
 #
-# @unstable: Member @x-checkpoint-delay is experimental.
+# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
+#            are experimental.
 #
 # Since: 2.4
 ##
@@ -1171,7 +1189,9 @@
             '*multifd-compression': 'MultiFDCompression',
             '*multifd-zlib-level': 'uint8',
             '*multifd-zstd-level': 'uint8',
-            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } }
+            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
+            '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
+                                            'features': [ 'unstable' ] } } }
 
 ##
 # @query-migrate-parameters:
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 9885d7c9f7..352e9ec716 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
                 }
             }
         }
+
+        monitor_printf(mon, "%s: %" PRIu64 " ms\n",
+        MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD),
+        params->x_vcpu_dirty_limit_period);
     }
 
     qapi_free_MigrationParameters(params);
@@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         error_setg(&err, "The block-bitmap-mapping parameter can only be set "
                    "through QMP");
         break;
+    case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD:
+        p->has_x_vcpu_dirty_limit_period = true;
+        visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err);
+        break;
     default:
         assert(0);
     }
diff --git a/migration/options.c b/migration/options.c
index 5a9505adf7..1de63ba775 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -80,6 +80,8 @@
 #define DEFINE_PROP_MIG_CAP(name, x)             \
     DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false)
 
+#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD     1000    /* milliseconds */
+
 Property migration_properties[] = {
     DEFINE_PROP_BOOL("store-global-state", MigrationState,
                      store_global_state, true),
@@ -163,6 +165,9 @@ Property migration_properties[] = {
     DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
     DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
     DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
+    DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState,
+                       parameters.x_vcpu_dirty_limit_period,
+                       DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD),
 
     /* Migration capabilities */
     DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
@@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
                        s->parameters.block_bitmap_mapping);
     }
 
+    params->has_x_vcpu_dirty_limit_period = true;
+    params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period;
+
     return params;
 }
 
@@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params)
     params->has_announce_max = true;
     params->has_announce_rounds = true;
     params->has_announce_step = true;
+    params->has_x_vcpu_dirty_limit_period = true;
 }
 
 /*
@@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
     }
 #endif
 
+    if (params->has_x_vcpu_dirty_limit_period &&
+        (params->x_vcpu_dirty_limit_period < 1 ||
+         params->x_vcpu_dirty_limit_period > 1000)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                   "x-vcpu-dirty-limit-period",
+                   "a value between 1 and 1000");
+        return false;
+    }
+
     return true;
 }
 
@@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
         dest->has_block_bitmap_mapping = true;
         dest->block_bitmap_mapping = params->block_bitmap_mapping;
     }
+
+    if (params->has_x_vcpu_dirty_limit_period) {
+        dest->x_vcpu_dirty_limit_period =
+            params->x_vcpu_dirty_limit_period;
+    }
 }
 
 static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
@@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
             QAPI_CLONE(BitmapMigrationNodeAliasList,
                        params->block_bitmap_mapping);
     }
+
+    if (params->has_x_vcpu_dirty_limit_period) {
+        s->parameters.x_vcpu_dirty_limit_period =
+            params->x_vcpu_dirty_limit_period;
+    }
 }
 
 void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp)
-- 
2.40.1




* [PATCH 05/26] qapi/migration: Introduce vcpu-dirty-limit parameters
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (3 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 04/26] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 06/26] migration: Introduce dirty-limit capability Juan Quintela
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <yong.huang@smartx.com>

Introduce the "vcpu-dirty-limit" migration parameter, used
to limit the dirty page rate during live migration.

"vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are
two dirty-limit-related migration parameters, which can
be set before and during live migration via the QMP command
migrate-set-parameters.

These two parameters are used to help implement the dirty
page rate limit algorithm of migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 qapi/migration.json            | 18 +++++++++++++++---
 migration/migration-hmp-cmds.c |  8 ++++++++
 migration/options.c            | 21 +++++++++++++++++++++
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 384b768e03..aa590dbf0e 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -793,6 +793,9 @@
 #                             live migration. Should be in the range 1 to 1000ms,
 #                             defaults to 1000ms. (Since 8.1)
 #
+# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
+#                    Defaults to 1. (Since 8.1)
+#
 # Features:
 #
 # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
@@ -816,7 +819,8 @@
            'max-cpu-throttle', 'multifd-compression',
            'multifd-zlib-level', 'multifd-zstd-level',
            'block-bitmap-mapping',
-           { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] } ] }
+           { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
+           'vcpu-dirty-limit'] }
 
 ##
 # @MigrateSetParameters:
@@ -955,6 +959,9 @@
 #                             live migration. Should be in the range 1 to 1000ms,
 #                             defaults to 1000ms. (Since 8.1)
 #
+# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
+#                    Defaults to 1. (Since 8.1)
+#
 # Features:
 #
 # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
@@ -995,7 +1002,8 @@
             '*multifd-zstd-level': 'uint8',
             '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
             '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
-                                            'features': [ 'unstable' ] } } }
+                                            'features': [ 'unstable' ] },
+            '*vcpu-dirty-limit': 'uint64'} }
 
 ##
 # @migrate-set-parameters:
@@ -1154,6 +1162,9 @@
 #                             live migration. Should be in the range 1 to 1000ms,
 #                             defaults to 1000ms. (Since 8.1)
 #
+# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
+#                    Defaults to 1. (Since 8.1)
+#
 # Features:
 #
 # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
@@ -1191,7 +1202,8 @@
             '*multifd-zstd-level': 'uint8',
             '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
             '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
-                                            'features': [ 'unstable' ] } } }
+                                            'features': [ 'unstable' ] },
+            '*vcpu-dirty-limit': 'uint64'} }
 
 ##
 # @query-migrate-parameters:
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 352e9ec716..35e8020bbf 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -368,6 +368,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %" PRIu64 " ms\n",
         MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD),
         params->x_vcpu_dirty_limit_period);
+
+        monitor_printf(mon, "%s: %" PRIu64 " MB/s\n",
+            MigrationParameter_str(MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT),
+            params->vcpu_dirty_limit);
     }
 
     qapi_free_MigrationParameters(params);
@@ -628,6 +632,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         p->has_x_vcpu_dirty_limit_period = true;
         visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err);
         break;
+    case MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT:
+        p->has_vcpu_dirty_limit = true;
+        visit_type_size(v, param, &p->vcpu_dirty_limit, &err);
+        break;
     default:
         assert(0);
     }
diff --git a/migration/options.c b/migration/options.c
index 1de63ba775..7d2d98830e 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -81,6 +81,7 @@
     DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false)
 
 #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD     1000    /* milliseconds */
+#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT            1       /* MB/s */
 
 Property migration_properties[] = {
     DEFINE_PROP_BOOL("store-global-state", MigrationState,
@@ -168,6 +169,9 @@ Property migration_properties[] = {
     DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState,
                        parameters.x_vcpu_dirty_limit_period,
                        DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD),
+    DEFINE_PROP_UINT64("vcpu-dirty-limit", MigrationState,
+                       parameters.vcpu_dirty_limit,
+                       DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT),
 
     /* Migration capabilities */
     DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
@@ -915,6 +919,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
 
     params->has_x_vcpu_dirty_limit_period = true;
     params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period;
+    params->has_vcpu_dirty_limit = true;
+    params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit;
 
     return params;
 }
@@ -949,6 +955,7 @@ void migrate_params_init(MigrationParameters *params)
     params->has_announce_rounds = true;
     params->has_announce_step = true;
     params->has_x_vcpu_dirty_limit_period = true;
+    params->has_vcpu_dirty_limit = true;
 }
 
 /*
@@ -1118,6 +1125,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
         return false;
     }
 
+    if (params->has_vcpu_dirty_limit &&
+        (params->vcpu_dirty_limit < 1)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                   "vcpu_dirty_limit",
+                   "is invalid, it must greater then 1 MB/s");
+        return false;
+    }
+
     return true;
 }
 
@@ -1222,6 +1237,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
         dest->x_vcpu_dirty_limit_period =
             params->x_vcpu_dirty_limit_period;
     }
+    if (params->has_vcpu_dirty_limit) {
+        dest->vcpu_dirty_limit = params->vcpu_dirty_limit;
+    }
 }
 
 static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
@@ -1345,6 +1363,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
         s->parameters.x_vcpu_dirty_limit_period =
             params->x_vcpu_dirty_limit_period;
     }
+    if (params->has_vcpu_dirty_limit) {
+        s->parameters.vcpu_dirty_limit = params->vcpu_dirty_limit;
+    }
 }
 
 void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp)
-- 
2.40.1




* [PATCH 06/26] migration: Introduce dirty-limit capability
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (4 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 05/26] qapi/migration: Introduce vcpu-dirty-limit parameters Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 07/26] migration: Refactor auto-converge capability logic Juan Quintela
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <yong.huang@smartx.com>

Introduce the migration dirty-limit capability, which can
be turned on before live migration to limit the dirty
page rate during live migration.

Introduce the migrate_dirty_limit function to check whether
the dirty-limit capability is enabled during live migration.

Meanwhile, refactor vcpu_dirty_rate_stat_collect
so that period can be configured instead of hardcoded.
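
As a rough illustration of the period selection described above, here
is a minimal standalone C sketch. The function name and the 1000 ms
default are assumptions made for the example; the real logic lives in
vcpu_dirty_rate_stat_collect() in the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed default sampling period in milliseconds (illustrative value). */
#define DEFAULT_CALC_TIME_MS 1000

/*
 * Pick the dirty-rate sampling period: the configured migration
 * parameter is used only while a dirty-limit migration is active,
 * otherwise the built-in default applies.
 */
int64_t dirty_rate_sample_period(bool dirty_limit_cap,
                                 bool migration_active,
                                 int64_t configured_period_ms)
{
    if (dirty_limit_cap && migration_active) {
        return configured_period_ms;
    }
    return DEFAULT_CALC_TIME_MS;
}
```

Outside of a dirty-limit migration the behaviour is unchanged, which
is why the default stays in place as the fallback.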

dirty-limit capability is kind of like auto-converge
but using dirty limit instead of traditional cpu-throttle
to throttle guest down. To enable this feature, turn on
the dirty-limit capability before live migration using
migrate-set-capabilities, and set the parameters
"x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably
to speed up convergence.
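
The preconditions for turning the capability on can be sketched in
standalone C as follows. The enum, the function name, and the
dirty_ring_available flag are hypothetical stand-ins for the example;
the actual checks are in migrate_caps_check() in the diff below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the migration capability indexes. */
enum { CAP_AUTO_CONVERGE, CAP_DIRTY_LIMIT, CAP_COUNT };

/*
 * Return NULL if the requested capability set is acceptable, otherwise
 * a message describing the problem.  dirty_ring_available models
 * "KVM is in use and its dirty ring is enabled".
 */
const char *dirty_limit_caps_check(const bool caps[CAP_COUNT],
                                   bool dirty_ring_available)
{
    if (!caps[CAP_DIRTY_LIMIT]) {
        return NULL;                /* nothing to validate */
    }
    if (caps[CAP_AUTO_CONVERGE]) {
        return "dirty-limit conflicts with auto-converge";
    }
    if (!dirty_ring_available) {
        return "dirty-limit requires KVM with the dirty ring enabled";
    }
    return NULL;
}
```

The two throttling schemes are mutually exclusive, so the conflict is
rejected when the capability is set rather than during migration.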

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 qapi/migration.json  | 13 ++++++++++++-
 migration/options.h  |  1 +
 migration/options.c  | 23 ++++++++++++++++++++++-
 softmmu/dirtylimit.c | 18 ++++++++++++++----
 4 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index aa590dbf0e..cc51835cdd 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -497,6 +497,16 @@
 #     are present.  'return-path' capability must be enabled to use
 #     it.  (since 8.1)
 #
+# @dirty-limit: If enabled, migration will use the dirty-limit algo to
+#               throttle down guest instead of auto-converge algo.
+#               Throttle algo only works when vCPU's dirtyrate is
+#               greater than 'vcpu-dirty-limit', so processes in the
+#               guest that only read memory are no longer penalized,
+#               which can improve vCPU performance during live
+#               migration.  This is an optional performance feature and
+#               should not affect the correctness of the existing
+#               auto-converge algo.  (since 8.1)
+#
 # Features:
 #
 # @unstable: Members @x-colo and @x-ignore-shared are experimental.
@@ -512,7 +522,8 @@
            'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
            { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
            'validate-uuid', 'background-snapshot',
-           'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] }
+           'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
+           'dirty-limit'] }
 
 ##
 # @MigrationCapabilityStatus:
diff --git a/migration/options.h b/migration/options.h
index 9aaf363322..045e2a41a2 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -29,6 +29,7 @@ bool migrate_block(void);
 bool migrate_colo(void);
 bool migrate_compress(void);
 bool migrate_dirty_bitmaps(void);
+bool migrate_dirty_limit(void);
 bool migrate_events(void);
 bool migrate_ignore_shared(void);
 bool migrate_late_block_activate(void);
diff --git a/migration/options.c b/migration/options.c
index 7d2d98830e..7d83f190d6 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -27,6 +27,7 @@
 #include "qemu-file.h"
 #include "ram.h"
 #include "options.h"
+#include "sysemu/kvm.h"
 
 /* Maximum migrate downtime set to 2000 seconds */
 #define MAX_MIGRATE_DOWNTIME_SECONDS 2000
@@ -196,7 +197,7 @@ Property migration_properties[] = {
 #endif
     DEFINE_PROP_MIG_CAP("x-switchover-ack",
                         MIGRATION_CAPABILITY_SWITCHOVER_ACK),
-
+    DEFINE_PROP_MIG_CAP("x-dirty-limit", MIGRATION_CAPABILITY_DIRTY_LIMIT),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -242,6 +243,13 @@ bool migrate_dirty_bitmaps(void)
     return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS];
 }
 
+bool migrate_dirty_limit(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT];
+}
+
 bool migrate_events(void)
 {
     MigrationState *s = migrate_get_current();
@@ -572,6 +580,19 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
             return false;
         }
     }
+    if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) {
+        if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) {
+            error_setg(errp, "dirty-limit conflicts with auto-converge;"
+                       " only one of them can be enabled at a time");
+            return false;
+        }
+
+        if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
+            error_setg(errp, "dirty-limit requires KVM with accelerator"
+                   " property 'dirty-ring-size' set");
+            return false;
+        }
+    }
 
     return true;
 }
diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c
index e80201097a..942d876523 100644
--- a/softmmu/dirtylimit.c
+++ b/softmmu/dirtylimit.c
@@ -24,6 +24,9 @@
 #include "hw/boards.h"
 #include "sysemu/kvm.h"
 #include "trace.h"
+#include "migration/misc.h"
+#include "migration/migration.h"
+#include "migration/options.h"
 
 /*
  * Dirtylimit stop working if dirty page rate error
@@ -75,14 +78,21 @@ static bool dirtylimit_quit;
 
 static void vcpu_dirty_rate_stat_collect(void)
 {
+    MigrationState *s = migrate_get_current();
     VcpuStat stat;
     int i = 0;
+    int64_t period = DIRTYLIMIT_CALC_TIME_MS;
+
+    if (migrate_dirty_limit() &&
+        migration_is_active(s)) {
+        period = s->parameters.x_vcpu_dirty_limit_period;
+    }
 
     /* calculate vcpu dirtyrate */
-    vcpu_calculate_dirtyrate(DIRTYLIMIT_CALC_TIME_MS,
-                             &stat,
-                             GLOBAL_DIRTY_LIMIT,
-                             false);
+    vcpu_calculate_dirtyrate(period,
+                              &stat,
+                              GLOBAL_DIRTY_LIMIT,
+                              false);
 
     for (i = 0; i < stat.nvcpu; i++) {
         vcpu_dirty_rate_stat->stat.rates[i].id = i;
-- 
2.40.1




* [PATCH 07/26] migration: Refactor auto-converge capability logic
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (5 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 06/26] migration: Introduce dirty-limit capability Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 08/26] migration: Put the detection logic before auto-converge checking Juan Quintela
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <yong.huang@smartx.com>

Check whether block migration is running before throttling
the guest down via auto-converge.

Note that this change is essentially a code cleanup: block
migration does not depend on the auto-converge capability,
so the order of the checks can be adjusted.
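
To make the "order of checks" point concrete, the two shapes of the
condition can be compared in a small standalone C sketch. The function
names are made up for the example and the booleans stand in for
migrate_auto_converge() and blk_mig_bulk_active().

```c
#include <stdbool.h>

/* Old shape: both conditions folded into a single test. */
bool throttle_old(bool auto_converge, bool blk_bulk_active)
{
    return auto_converge && !blk_bulk_active;
}

/* New shape: bail out on block migration first, then test the capability. */
bool throttle_new(bool auto_converge, bool blk_bulk_active)
{
    if (blk_bulk_active) {
        return false;
    }
    return auto_converge;
}
```

Both functions agree on every input; the early return simply leaves
room for another convergence algorithm to share the block-migration
guard.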

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-5@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 0ada6477e8..f31de47a47 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs)
     /* During block migration the auto-converge logic incorrectly detects
      * that ram migration makes no progress. Avoid this by disabling the
      * throttling logic during the bulk phase of block migration. */
-    if (migrate_auto_converge() && !blk_mig_bulk_active()) {
+    if (blk_mig_bulk_active()) {
+        return;
+    }
+
+    if (migrate_auto_converge()) {
         /* The following detection logic can be refined later. For now:
            Check to see if the ratio between dirtied bytes and the approx.
            amount of bytes that just got transferred since the last time
-- 
2.40.1




* [PATCH 08/26] migration: Put the detection logic before auto-converge checking
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (6 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 07/26] migration: Refactor auto-converge capability logic Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 09/26] migration: Implement dirty-limit convergence algo Juan Quintela
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <yong.huang@smartx.com>

This commit prepares for the implementation of the dirty-limit
convergence algo.

The detection logic for the throttling condition applies to both
the auto-converge and the dirty-limit algo, so move it ahead of
the check for the auto-converge feature.
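
A minimal standalone C sketch of the shared detection step is shown
here. The struct and function names are hypothetical; only the
counter field and the comparison mirror the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the counter kept in RAMState. */
struct throttle_state {
    int dirty_rate_high_cnt;
};

/*
 * Return true when the dirtied bytes exceeded the threshold for the
 * second time since the last trigger.  The counter is reset on a
 * trigger, independent of which convergence algorithm then acts.
 */
bool throttle_condition_met(struct throttle_state *st,
                            uint64_t bytes_dirty_period,
                            uint64_t bytes_dirty_threshold)
{
    if (bytes_dirty_period > bytes_dirty_threshold &&
        ++st->dirty_rate_high_cnt >= 2) {
        st->dirty_rate_high_cnt = 0;
        return true;
    }
    return false;
}
```

Because the reset now happens before the capability is tested, the
counter behaves the same whether auto-converge or another algorithm
consumes the result.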

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-6@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index f31de47a47..1d9300f4c5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs)
         return;
     }
 
-    if (migrate_auto_converge()) {
-        /* The following detection logic can be refined later. For now:
-           Check to see if the ratio between dirtied bytes and the approx.
-           amount of bytes that just got transferred since the last time
-           we were in this routine reaches the threshold. If that happens
-           twice, start or increase throttling. */
-
-        if ((bytes_dirty_period > bytes_dirty_threshold) &&
-            (++rs->dirty_rate_high_cnt >= 2)) {
+    /*
+     * The following detection logic can be refined later. For now:
+     * Check to see if the ratio between dirtied bytes and the approx.
+     * amount of bytes that just got transferred since the last time
+     * we were in this routine reaches the threshold. If that happens
+     * twice, start or increase throttling.
+     */
+    if ((bytes_dirty_period > bytes_dirty_threshold) &&
+        (++rs->dirty_rate_high_cnt >= 2)) {
+        rs->dirty_rate_high_cnt = 0;
+        if (migrate_auto_converge()) {
             trace_migration_throttle();
-            rs->dirty_rate_high_cnt = 0;
             mig_throttle_guest_down(bytes_dirty_period,
                                     bytes_dirty_threshold);
         }
-- 
2.40.1




* [PATCH 09/26] migration: Implement dirty-limit convergence algo
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (7 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 08/26] migration: Put the detection logic before auto-converge checking Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 10/26] migration: Extend query-migrate to provide dirty page limit info Juan Quintela
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <yong.huang@smartx.com>

Implement the dirty-limit convergence algo for live migration,
which is similar to the auto-converge algo but uses dirty-limit
instead of cpu throttle to make migration converge.

Enable the dirty page limit once dirty_rate_high_cnt reaches 2
while the dirty-limit capability is enabled, and disable it if
the migration is canceled.

Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit"
commands are not allowed during dirty-limit live migration.
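
The "apply the quota only when something changed" behaviour of
migration_dirty_limit_guest() can be sketched in standalone C. The
struct, its fields, and the apply_count counter are assumptions for
the example; in the patch the quota is a function-local static and
the limit is applied through qmp_set_vcpu_dirty_limit().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the dirty-limit service state. */
struct dirty_limit_ctl {
    bool in_service;      /* is the dirty limit currently active? */
    int64_t quota;        /* last quota applied, in MB/s */
    int apply_count;      /* how many times a quota was applied */
};

/*
 * Apply vcpu_dirty_limit as the quota for all vCPUs, skipping the
 * work when the limit is already active with the same value.
 */
void dirty_limit_guest(struct dirty_limit_ctl *ctl, int64_t vcpu_dirty_limit)
{
    if (ctl->in_service && ctl->quota == vcpu_dirty_limit) {
        return;
    }
    ctl->quota = vcpu_dirty_limit;
    ctl->in_service = true;   /* stands in for the real set-limit call */
    ctl->apply_count++;
}
```

Changing the vcpu-dirty-limit parameter mid-migration therefore takes
effect the next time the throttle condition triggers.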

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-7@git.sr.ht>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c  |  3 +++
 migration/ram.c        | 36 ++++++++++++++++++++++++++++++++++++
 softmmu/dirtylimit.c   | 29 +++++++++++++++++++++++++++++
 migration/trace-events |  1 +
 4 files changed, 69 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index ae49d42eab..49332251e8 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -166,6 +166,9 @@ void migration_cancel(const Error *error)
     if (error) {
         migrate_set_error(current_migration, error);
     }
+    if (migrate_dirty_limit()) {
+        qmp_cancel_vcpu_dirty_limit(false, -1, NULL);
+    }
     migrate_fd_cancel(current_migration);
 }
 
diff --git a/migration/ram.c b/migration/ram.c
index 1d9300f4c5..9040d66e61 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -46,6 +46,7 @@
 #include "qapi/error.h"
 #include "qapi/qapi-types-migration.h"
 #include "qapi/qapi-events-migration.h"
+#include "qapi/qapi-commands-migration.h"
 #include "qapi/qmp/qerror.h"
 #include "trace.h"
 #include "exec/ram_addr.h"
@@ -59,6 +60,8 @@
 #include "multifd.h"
 #include "sysemu/runstate.h"
 #include "options.h"
+#include "sysemu/dirtylimit.h"
+#include "sysemu/kvm.h"
 
 #include "hw/boards.h" /* for machine_dump_guest_core() */
 
@@ -984,6 +987,37 @@ static void migration_update_rates(RAMState *rs, int64_t end_time)
     }
 }
 
+/*
+ * Enable dirty-limit to throttle down the guest
+ */
+static void migration_dirty_limit_guest(void)
+{
+    /*
+     * dirty page rate quota for all vCPUs fetched from
+     * migration parameter 'vcpu_dirty_limit'
+     */
+    static int64_t quota_dirtyrate;
+    MigrationState *s = migrate_get_current();
+
+    /*
+     * Nothing to do if dirty limit is already enabled and the
+     * migration parameter vcpu-dirty-limit is unchanged.
+     */
+    if (dirtylimit_in_service() &&
+        quota_dirtyrate == s->parameters.vcpu_dirty_limit) {
+        return;
+    }
+
+    quota_dirtyrate = s->parameters.vcpu_dirty_limit;
+
+    /*
+     * Set a dirty rate quota on all vCPUs; note that the second
+     * parameter is ignored when the quota applies to every vCPU
+     */
+    qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL);
+    trace_migration_dirty_limit_guest(quota_dirtyrate);
+}
+
 static void migration_trigger_throttle(RAMState *rs)
 {
     uint64_t threshold = migrate_throttle_trigger_threshold();
@@ -1013,6 +1047,8 @@ static void migration_trigger_throttle(RAMState *rs)
             trace_migration_throttle();
             mig_throttle_guest_down(bytes_dirty_period,
                                     bytes_dirty_threshold);
+        } else if (migrate_dirty_limit()) {
+            migration_dirty_limit_guest();
         }
     }
 }
diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c
index 942d876523..a6d854d161 100644
--- a/softmmu/dirtylimit.c
+++ b/softmmu/dirtylimit.c
@@ -436,6 +436,23 @@ static void dirtylimit_cleanup(void)
     dirtylimit_state_finalize();
 }
 
+/*
+ * The dirty page rate limit must not be changed while migration
+ * is running with the dirty-limit capability enabled.
+ */
+static bool dirtylimit_is_allowed(void)
+{
+    MigrationState *ms = migrate_get_current();
+
+    if (migration_is_running(ms->state) &&
+        (!qemu_thread_is_self(&ms->thread)) &&
+        migrate_dirty_limit() &&
+        dirtylimit_in_service()) {
+        return false;
+    }
+    return true;
+}
+
 void qmp_cancel_vcpu_dirty_limit(bool has_cpu_index,
                                  int64_t cpu_index,
                                  Error **errp)
@@ -449,6 +466,12 @@ void qmp_cancel_vcpu_dirty_limit(bool has_cpu_index,
         return;
     }
 
+    if (!dirtylimit_is_allowed()) {
+        error_setg(errp, "can't cancel dirty page rate limit while"
+                   " migration is running");
+        return;
+    }
+
     if (!dirtylimit_in_service()) {
         return;
     }
@@ -499,6 +522,12 @@ void qmp_set_vcpu_dirty_limit(bool has_cpu_index,
         return;
     }
 
+    if (!dirtylimit_is_allowed()) {
+        error_setg(errp, "can't set dirty page rate limit while"
+                   " migration is running");
+        return;
+    }
+
     if (!dirty_rate) {
         qmp_cancel_vcpu_dirty_limit(has_cpu_index, cpu_index, errp);
         return;
diff --git a/migration/trace-events b/migration/trace-events
index 5259c1044b..580895e86e 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -93,6 +93,7 @@ migration_bitmap_sync_start(void) ""
 migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx"
 migration_throttle(void) ""
+migration_dirty_limit_guest(int64_t dirtyrate) "guest dirty page rate limit %" PRIi64 " MB/s"
 ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx"
 ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p"
 ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x"
-- 
2.40.1




* [PATCH 10/26] migration: Extend query-migrate to provide dirty page limit info
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (8 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 09/26] migration: Implement dirty-limit convergence algo Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 11/26] migration-test: Be consistent for ppc Juan Quintela
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <yong.huang@smartx.com>

Extend query-migrate to report the throttle time and the estimated
ring full time when the dirty-limit capability is enabled, so that
we can observe whether dirty limit takes effect during live migration.
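
The two reported values can be approximated with the standalone C
sketch below. Function names, the array-based interface, and the
units (ring size in MiB, rates in MB/s, result in microseconds) are
assumptions for the example. The patch itself iterates over CPUState
and delegates to dirtylimit_dirty_ring_full_time(), whose details are
not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest per-vCPU throttle time, in microseconds. */
uint64_t max_throttle_us(const int64_t *throttle_us, size_t n)
{
    int64_t max = 0;

    for (size_t i = 0; i < n; i++) {
        if (throttle_us[i] > max) {
            max = throttle_us[i];
        }
    }
    return (uint64_t)max;
}

/*
 * Estimated time to fill the dirty ring: ring memory size divided by
 * the average dirty rate of the running vCPUs.  Returns 0 when the
 * guest is not dirtying memory.
 */
uint64_t ring_full_time_us(uint64_t ring_size_mib,
                           const uint64_t *rates_mbps, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        sum += rates_mbps[i];
    }
    if (n == 0 || sum / n == 0) {
        return 0;   /* also guards the division below */
    }
    return ring_size_mib * 1000000 / (sum / n);
}
```

For example, a 16 MiB ring and two vCPUs dirtying 100 and 300 MB/s
give an average of 200 MB/s and an estimate of 80000 us.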

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 qapi/migration.json            | 16 +++++++++++++-
 include/sysemu/dirtylimit.h    |  2 ++
 migration/migration-hmp-cmds.c | 10 +++++++++
 migration/migration.c          | 10 +++++++++
 softmmu/dirtylimit.c           | 39 ++++++++++++++++++++++++++++++++++
 5 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index cc51835cdd..ebc15e2782 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -250,6 +250,18 @@
 #     blocked.  Present and non-empty when migration is blocked.
 #     (since 6.0)
 #
+# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual
+#                                       CPUs in each dirty ring full round, which shows how
+#                                       MigrationCapability dirty-limit affects the guest
+#                                       during live migration. (since 8.1)
+#
+# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds)
+#                              for each dirty ring full round. The value equals the
+#                              dirty ring memory size divided by the average dirty page rate
+#                              of the virtual CPU, which can be used to observe the average
+#                              memory load of the virtual CPU indirectly. Note that zero
+#                              means the guest doesn't dirty memory. (since 8.1)
+#
 # Since: 0.14
 ##
 { 'struct': 'MigrationInfo',
@@ -267,7 +279,9 @@
            '*postcopy-blocktime' : 'uint32',
            '*postcopy-vcpu-blocktime': ['uint32'],
            '*compression': 'CompressionStats',
-           '*socket-address': ['SocketAddress'] } }
+           '*socket-address': ['SocketAddress'],
+           '*dirty-limit-throttle-time-per-round': 'uint64',
+           '*dirty-limit-ring-full-time': 'uint64'} }
 
 ##
 # @query-migrate:
diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h
index 8d2c1f3a6b..d11ebbbbdb 100644
--- a/include/sysemu/dirtylimit.h
+++ b/include/sysemu/dirtylimit.h
@@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index,
 void dirtylimit_set_all(uint64_t quota,
                         bool enable);
 void dirtylimit_vcpu_execute(CPUState *cpu);
+uint64_t dirtylimit_throttle_time_per_round(void);
+uint64_t dirtylimit_ring_full_time(void);
 #endif
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 35e8020bbf..c115ef2d23 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->cpu_throttle_percentage);
     }
 
+    if (info->has_dirty_limit_throttle_time_per_round) {
+        monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n",
+                       info->dirty_limit_throttle_time_per_round);
+    }
+
+    if (info->has_dirty_limit_ring_full_time) {
+        monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n",
+                       info->dirty_limit_ring_full_time);
+    }
+
     if (info->has_postcopy_blocktime) {
         monitor_printf(mon, "postcopy blocktime: %u\n",
                        info->postcopy_blocktime);
diff --git a/migration/migration.c b/migration/migration.c
index 49332251e8..1ea7512291 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -64,6 +64,7 @@
 #include "yank_functions.h"
 #include "sysemu/qtest.h"
 #include "options.h"
+#include "sysemu/dirtylimit.h"
 
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
@@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
         info->ram->dirty_pages_rate =
            stat64_get(&mig_stats.dirty_pages_rate);
     }
+
+    if (migrate_dirty_limit() && dirtylimit_in_service()) {
+        info->has_dirty_limit_throttle_time_per_round = true;
+        info->dirty_limit_throttle_time_per_round =
+                            dirtylimit_throttle_time_per_round();
+
+        info->has_dirty_limit_ring_full_time = true;
+        info->dirty_limit_ring_full_time = dirtylimit_ring_full_time();
+    }
 }
 
 static void populate_disk_info(MigrationInfo *info)
diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c
index a6d854d161..3c275ee55b 100644
--- a/softmmu/dirtylimit.c
+++ b/softmmu/dirtylimit.c
@@ -565,6 +565,45 @@ out:
     hmp_handle_error(mon, err);
 }
 
+/* Return the max throttle time across all virtual CPUs */
+uint64_t dirtylimit_throttle_time_per_round(void)
+{
+    CPUState *cpu;
+    int64_t max = 0;
+
+    CPU_FOREACH(cpu) {
+        if (cpu->throttle_us_per_full > max) {
+            max = cpu->throttle_us_per_full;
+        }
+    }
+
+    return max;
+}
+
+/*
+ * Estimate average dirty ring full time of each virtual CPU.
+ * Return 0 if guest doesn't dirty memory.
+ */
+uint64_t dirtylimit_ring_full_time(void)
+{
+    CPUState *cpu;
+    uint64_t curr_rate = 0;
+    int nvcpus = 0;
+
+    CPU_FOREACH(cpu) {
+        if (cpu->running) {
+            nvcpus++;
+            curr_rate += vcpu_dirty_rate_get(cpu->cpu_index);
+        }
+    }
+
+    if (!curr_rate || !nvcpus) {
+        return 0;
+    }
+
+    return dirtylimit_dirty_ring_full_time(curr_rate / nvcpus);
+}
+
 static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index)
 {
     DirtyLimitInfo *info = NULL;
-- 
2.40.1




* [PATCH 11/26] migration-test: Be consistent for ppc
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (9 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 10/26] migration: Extend query-migrate to provide dirty page limit info Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 12/26] migration-test: Make machine_opts regular with other options Juan Quintela
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

It makes no sense that we don't have the same configuration on both sides.

Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-ID: <20230608224943.3877-2-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 tests/qtest/migration-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index e256da1216..2296ed4bf5 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -748,7 +748,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                       "'nvramrc=hex .\" _\" begin %x %x "
                                       "do i c@ 1 + i c! 1000 +loop .\" B\" 0 "
                                       "until'", end_address, start_address);
-        arch_target = g_strdup("");
+        arch_target = g_strdup("-nodefaults");
     } else if (strcmp(arch, "aarch64") == 0) {
         init_bootfile(bootpath, aarch64_kernel, sizeof(aarch64_kernel));
         machine_opts = "virt,gic-version=max";
-- 
2.40.1




* [PATCH 12/26] migration-test: Make machine_opts regular with other options
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (10 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 11/26] migration-test: Be consistent for ppc Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 13/26] migration-test: Create arch_opts Juan Quintela
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230608224943.3877-5-quintela@redhat.com>
---
 tests/qtest/migration-test.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 2296ed4bf5..f51a25e299 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -739,7 +739,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         start_address = S390_TEST_MEM_START;
         end_address = S390_TEST_MEM_END;
     } else if (strcmp(arch, "ppc64") == 0) {
-        machine_opts = "vsmt=8";
+        machine_opts = "-machine vsmt=8";
         memory_size = "256M";
         start_address = PPC_TEST_MEM_START;
         end_address = PPC_TEST_MEM_END;
@@ -751,7 +751,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         arch_target = g_strdup("-nodefaults");
     } else if (strcmp(arch, "aarch64") == 0) {
         init_bootfile(bootpath, aarch64_kernel, sizeof(aarch64_kernel));
-        machine_opts = "virt,gic-version=max";
+        machine_opts = "-machine virt,gic-version=max";
         memory_size = "150M";
         arch_source = g_strdup_printf("-cpu max "
                                       "-kernel %s",
@@ -791,14 +791,13 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         shmem_opts = g_strdup("");
     }
 
-    cmd_source = g_strdup_printf("-accel kvm%s -accel tcg%s%s "
+    cmd_source = g_strdup_printf("-accel kvm%s -accel tcg %s "
                                  "-name source,debug-threads=on "
                                  "-m %s "
                                  "-serial file:%s/src_serial "
                                  "%s %s %s %s",
                                  args->use_dirty_ring ?
                                  ",dirty-ring-size=4096" : "",
-                                 machine_opts ? " -machine " : "",
                                  machine_opts ? machine_opts : "",
                                  memory_size, tmpfs,
                                  arch_source, shmem_opts,
@@ -811,7 +810,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                      &got_src_stop);
     }
 
-    cmd_target = g_strdup_printf("-accel kvm%s -accel tcg%s%s "
+    cmd_target = g_strdup_printf("-accel kvm%s -accel tcg %s "
                                  "-name target,debug-threads=on "
                                  "-m %s "
                                  "-serial file:%s/dest_serial "
@@ -819,7 +818,6 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                  "%s %s %s %s",
                                  args->use_dirty_ring ?
                                  ",dirty-ring-size=4096" : "",
-                                 machine_opts ? " -machine " : "",
                                  machine_opts ? machine_opts : "",
                                  memory_size, tmpfs, uri,
                                  arch_target, shmem_opts,
-- 
2.40.1




* [PATCH 13/26] migration-test: Create arch_opts
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (11 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 12/26] migration-test: Make machine_opts regular with other options Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 14/26] migration-test: machine_opts is really arch specific Juan Quintela
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

This will contain the options needed for both source and target.

Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230608224943.3877-6-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 tests/qtest/migration-test.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)
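
A minimal sketch of the idea in this patch, outside of QEMU and glib: the options shared by both sides are kept in one string, and each command line appends its own side-specific string, with unset (NULL) strings treated as empty. The helper names `opt_or_empty` and `build_cmdline` are hypothetical and exist only for this illustration. The real test uses g_strdup_printf() and many more options.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: treat an unset (NULL) option string as empty. */
static const char *opt_or_empty(const char *s)
{
    return s ? s : "";
}

/*
 * Build a simplified command line: the options common to source and
 * target come first, followed by the options specific to one side.
 * Returns the snprintf() result.
 */
static int build_cmdline(char *buf, size_t len, const char *name,
                         const char *arch_opts, const char *arch_side)
{
    assert(buf && name);
    return snprintf(buf, len, "-name %s %s %s", name,
                    opt_or_empty(arch_opts), opt_or_empty(arch_side));
}
```

Calling build_cmdline() once with the source-only string and once with the target-only string gives two command lines that share the same arch_opts, which is what lets the per-architecture branches stop duplicating the string.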

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index f51a25e299..c723f083da 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -702,6 +702,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
 {
     g_autofree gchar *arch_source = NULL;
     g_autofree gchar *arch_target = NULL;
+    /* options for source and target */
+    g_autofree gchar *arch_opts = NULL;
     g_autofree gchar *cmd_source = NULL;
     g_autofree gchar *cmd_target = NULL;
     const gchar *ignore_stderr;
@@ -727,15 +729,13 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         assert(sizeof(x86_bootsect) == 512);
         init_bootfile(bootpath, x86_bootsect, sizeof(x86_bootsect));
         memory_size = "150M";
-        arch_source = g_strdup_printf("-drive file=%s,format=raw", bootpath);
-        arch_target = g_strdup(arch_source);
+        arch_opts = g_strdup_printf("-drive file=%s,format=raw", bootpath);
         start_address = X86_TEST_MEM_START;
         end_address = X86_TEST_MEM_END;
     } else if (g_str_equal(arch, "s390x")) {
         init_bootfile(bootpath, s390x_elf, sizeof(s390x_elf));
         memory_size = "128M";
-        arch_source = g_strdup_printf("-bios %s", bootpath);
-        arch_target = g_strdup(arch_source);
+        arch_opts = g_strdup_printf("-bios %s", bootpath);
         start_address = S390_TEST_MEM_START;
         end_address = S390_TEST_MEM_END;
     } else if (strcmp(arch, "ppc64") == 0) {
@@ -743,20 +743,16 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         memory_size = "256M";
         start_address = PPC_TEST_MEM_START;
         end_address = PPC_TEST_MEM_END;
-        arch_source = g_strdup_printf("-nodefaults "
-                                      "-prom-env 'use-nvramrc?=true' -prom-env "
+        arch_source = g_strdup_printf("-prom-env 'use-nvramrc?=true' -prom-env "
                                       "'nvramrc=hex .\" _\" begin %x %x "
                                       "do i c@ 1 + i c! 1000 +loop .\" B\" 0 "
                                       "until'", end_address, start_address);
-        arch_target = g_strdup("-nodefaults");
+        arch_opts = g_strdup("-nodefaults");
     } else if (strcmp(arch, "aarch64") == 0) {
         init_bootfile(bootpath, aarch64_kernel, sizeof(aarch64_kernel));
         machine_opts = "-machine virt,gic-version=max";
         memory_size = "150M";
-        arch_source = g_strdup_printf("-cpu max "
-                                      "-kernel %s",
-                                      bootpath);
-        arch_target = g_strdup(arch_source);
+        arch_opts = g_strdup_printf("-cpu max -kernel %s", bootpath);
         start_address = ARM_TEST_MEM_START;
         end_address = ARM_TEST_MEM_END;
 
@@ -795,12 +791,14 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                  "-name source,debug-threads=on "
                                  "-m %s "
                                  "-serial file:%s/src_serial "
-                                 "%s %s %s %s",
+                                 "%s %s %s %s %s",
                                  args->use_dirty_ring ?
                                  ",dirty-ring-size=4096" : "",
                                  machine_opts ? machine_opts : "",
                                  memory_size, tmpfs,
-                                 arch_source, shmem_opts,
+                                 arch_opts ? arch_opts : "",
+                                 arch_source ? arch_source : "",
+                                 shmem_opts,
                                  args->opts_source ? args->opts_source : "",
                                  ignore_stderr);
     if (!args->only_target) {
@@ -815,12 +813,14 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                  "-m %s "
                                  "-serial file:%s/dest_serial "
                                  "-incoming %s "
-                                 "%s %s %s %s",
+                                 "%s %s %s %s %s",
                                  args->use_dirty_ring ?
                                  ",dirty-ring-size=4096" : "",
                                  machine_opts ? machine_opts : "",
                                  memory_size, tmpfs, uri,
-                                 arch_target, shmem_opts,
+                                 arch_opts ? arch_opts : "",
+                                 arch_target ? arch_target : "",
+                                 shmem_opts,
                                  args->opts_target ? args->opts_target : "",
                                  ignore_stderr);
     *to = qtest_init(cmd_target);
-- 
2.40.1




* [PATCH 14/26] migration-test: machine_opts is really arch specific
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (12 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 13/26] migration-test: Create arch_opts Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 15/26] migration.json: Don't use space before colon Juan Quintela
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

machine_opts is needed on both source and target, so put it in arch_opts.

Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230608224943.3877-7-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 tests/qtest/migration-test.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index c723f083da..fd145e38d9 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -711,7 +711,6 @@ static int test_migrate_start(QTestState **from, QTestState **to,
     g_autofree char *shmem_opts = NULL;
     g_autofree char *shmem_path = NULL;
     const char *arch = qtest_get_arch();
-    const char *machine_opts = NULL;
     const char *memory_size;
 
     if (args->use_shmem) {
@@ -739,7 +738,6 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         start_address = S390_TEST_MEM_START;
         end_address = S390_TEST_MEM_END;
     } else if (strcmp(arch, "ppc64") == 0) {
-        machine_opts = "-machine vsmt=8";
         memory_size = "256M";
         start_address = PPC_TEST_MEM_START;
         end_address = PPC_TEST_MEM_END;
@@ -747,12 +745,12 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                       "'nvramrc=hex .\" _\" begin %x %x "
                                       "do i c@ 1 + i c! 1000 +loop .\" B\" 0 "
                                       "until'", end_address, start_address);
-        arch_opts = g_strdup("-nodefaults");
+        arch_opts = g_strdup("-nodefaults -machine vsmt=8");
     } else if (strcmp(arch, "aarch64") == 0) {
         init_bootfile(bootpath, aarch64_kernel, sizeof(aarch64_kernel));
-        machine_opts = "-machine virt,gic-version=max";
         memory_size = "150M";
-        arch_opts = g_strdup_printf("-cpu max -kernel %s", bootpath);
+        arch_opts = g_strdup_printf("-machine virt,gic-version=max -cpu max "
+                                    "-kernel %s", bootpath);
         start_address = ARM_TEST_MEM_START;
         end_address = ARM_TEST_MEM_END;
 
@@ -787,14 +785,13 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         shmem_opts = g_strdup("");
     }
 
-    cmd_source = g_strdup_printf("-accel kvm%s -accel tcg %s "
+    cmd_source = g_strdup_printf("-accel kvm%s -accel tcg "
                                  "-name source,debug-threads=on "
                                  "-m %s "
                                  "-serial file:%s/src_serial "
                                  "%s %s %s %s %s",
                                  args->use_dirty_ring ?
                                  ",dirty-ring-size=4096" : "",
-                                 machine_opts ? machine_opts : "",
                                  memory_size, tmpfs,
                                  arch_opts ? arch_opts : "",
                                  arch_source ? arch_source : "",
@@ -808,7 +805,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                      &got_src_stop);
     }
 
-    cmd_target = g_strdup_printf("-accel kvm%s -accel tcg %s "
+    cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
                                  "-name target,debug-threads=on "
                                  "-m %s "
                                  "-serial file:%s/dest_serial "
@@ -816,7 +813,6 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                  "%s %s %s %s %s",
                                  args->use_dirty_ring ?
                                  ",dirty-ring-size=4096" : "",
-                                 machine_opts ? machine_opts : "",
                                  memory_size, tmpfs, uri,
                                  arch_opts ? arch_opts : "",
                                  arch_target ? arch_target : "",
-- 
2.40.1




* [PATCH 15/26] migration.json: Don't use space before colon
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (13 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 14/26] migration-test: machine_opts is really arch specific Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 16/26] migration: skipped field is really obsolete Juan Quintela
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

So that the whole file is consistent.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230612191604.2219-1-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 qapi/migration.json | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index ebc15e2782..7ccb28e64f 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -67,13 +67,13 @@
 { 'struct': 'MigrationStats',
   'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int' ,
            'duplicate': 'int', 'skipped': 'int', 'normal': 'int',
-           'normal-bytes': 'int', 'dirty-pages-rate' : 'int',
-           'mbps' : 'number', 'dirty-sync-count' : 'int',
-           'postcopy-requests' : 'int', 'page-size' : 'int',
-           'multifd-bytes' : 'uint64', 'pages-per-second' : 'uint64',
-           'precopy-bytes' : 'uint64', 'downtime-bytes' : 'uint64',
-           'postcopy-bytes' : 'uint64',
-           'dirty-sync-missed-zero-copy' : 'uint64' } }
+           'normal-bytes': 'int', 'dirty-pages-rate': 'int',
+           'mbps': 'number', 'dirty-sync-count': 'int',
+           'postcopy-requests': 'int', 'page-size': 'int',
+           'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
+           'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
+           'postcopy-bytes': 'uint64',
+           'dirty-sync-missed-zero-copy': 'uint64' } }
 
 ##
 # @XBZRLECacheStats:
@@ -276,7 +276,7 @@
            '*cpu-throttle-percentage': 'int',
            '*error-desc': 'str',
            '*blocked-reasons': ['str'],
-           '*postcopy-blocktime' : 'uint32',
+           '*postcopy-blocktime': 'uint32',
            '*postcopy-vcpu-blocktime': ['uint32'],
            '*compression': 'CompressionStats',
            '*socket-address': ['SocketAddress'],
@@ -551,7 +551,7 @@
 # Since: 1.2
 ##
 { 'struct': 'MigrationCapabilityStatus',
-  'data': { 'capability' : 'MigrationCapability', 'state' : 'bool' } }
+  'data': { 'capability': 'MigrationCapability', 'state': 'bool' } }
 
 ##
 # @migrate-set-capabilities:
@@ -1634,7 +1634,7 @@
 # Since: 2.9
 ##
 { 'command': 'xen-set-replication',
-  'data': { 'enable': 'bool', 'primary': 'bool', '*failover' : 'bool' },
+  'data': { 'enable': 'bool', 'primary': 'bool', '*failover': 'bool' },
   'if': 'CONFIG_REPLICATION' }
 
 ##
-- 
2.40.1




* [PATCH 16/26] migration: skipped field is really obsolete.
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (14 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 15/26] migration.json: Don't use space before colon Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 17/26] docs/migration: Update postcopy bits Juan Quintela
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Daniel P . Berrangé

It has returned zero for more than 10 years.

Specifically, we introduced the field in 1.5.0

commit f1c72795af573b24a7da5eb52375c9aba8a37972
Author: Peter Lieven <pl@kamp.de>
Date:   Tue Mar 26 10:58:37 2013 +0100

    migration: do not sent zero pages in bulk stage

    during bulk stage of ram migration if a page is a
    zero page do not send it at all.
    the memory at the destination reads as zero anyway.

    even if there is an madvise with QEMU_MADV_DONTNEED
    at the target upon receipt of a zero page I have observed
    that the target starts swapping if the memory is overcommitted.
    it seems that the pages are dropped asynchronously.

    this patch also updates QMP to return the number of
    skipped pages in MigrationStats.

but removed its usage in 1.5.3

commit 9ef051e5536b6368a1076046ec6c4ec4ac12b5c6
Author: Peter Lieven <pl@kamp.de>
Date:   Mon Jun 10 12:14:19 2013 +0200

    Revert "migration: do not sent zero pages in bulk stage"

    Not sending zero pages breaks migration if a page is zero
    at the source but not at the destination. This can e.g. happen
    if different BIOS versions are used at source and destination.
    It has also been reported that migration on pseries is completely
    broken with this patch.

    This effectively reverts commit f1c72795af573b24a7da5eb52375c9aba8a37972.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230612193344.3796-2-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 docs/about/deprecated.rst | 10 ++++++++++
 qapi/migration.json       | 12 ++++++++++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 02ea5a839f..1c35f55666 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -451,3 +451,13 @@ both, older and future versions of QEMU.
 The ``blacklist`` config file option has been renamed to ``block-rpcs``
 (to be in sync with the renaming of the corresponding command line
 option).
+
+Migration
+---------
+
+``skipped`` MigrationStats field (since 8.1)
+''''''''''''''''''''''''''''''''''''''''''''
+
+``skipped`` field in Migration stats has been deprecated.  It hasn't
+been used for more than 10 years.
+
diff --git a/qapi/migration.json b/qapi/migration.json
index 7ccb28e64f..bc9ae3fef7 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -23,7 +23,8 @@
 #
 # @duplicate: number of duplicate (zero) pages (since 1.2)
 #
-# @skipped: number of skipped zero pages (since 1.5)
+# @skipped: number of skipped zero pages. Always zero, only provided for
+#     compatibility (since 1.5)
 #
 # @normal: number of normal pages (since 1.2)
 #
@@ -62,11 +63,18 @@
 #     between 0 and @dirty-sync-count * @multifd-channels.  (since
 #     7.1)
 #
+# Features:
+#
+# @deprecated: Member @skipped is always zero since 1.5.3
+#
 # Since: 0.14
+#
 ##
 { 'struct': 'MigrationStats',
   'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int' ,
-           'duplicate': 'int', 'skipped': 'int', 'normal': 'int',
+           'duplicate': 'int',
+           'skipped': { 'type': 'int', 'features': ['deprecated'] },
+           'normal': 'int',
            'normal-bytes': 'int', 'dirty-pages-rate': 'int',
            'mbps': 'number', 'dirty-sync-count': 'int',
            'postcopy-requests': 'int', 'page-size': 'int',
-- 
2.40.1




* [PATCH 17/26] docs/migration: Update postcopy bits
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (15 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 16/26] migration: skipped field is really obsolete Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 18/26] migration: Update error description whenever migration fails Juan Quintela
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Laszlo Ersek

From: Peter Xu <peterx@redhat.com>

Postcopy recovery exists but is not reflected in the document, so update
the document to cover it.

Add a very small section on postcopy preempt.

Touch up the pagemap section, dropping the unsent map because it's already
been dropped in the source code in commit 1e7cf8c323 ("migration/postcopy:
unsentmap is not necessary for postcopy").

Touch up the postcopy section to remove "network connection" failures as a
downside, because they are no longer fatal and can be recovered from.
Suggested by Laszlo.

Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230706115611.371048-1-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 docs/devel/migration.rst | 94 ++++++++++++++++++++++++++++------------
 1 file changed, 67 insertions(+), 27 deletions(-)
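
The recovery sequence documented by this patch uses two QMP commands: 'migrate-recover' on the destination, then 'migrate' with resume=true on the source. As a rough sketch, the two messages can be formatted as below. The function names are hypothetical, the URI is a placeholder chosen by the caller, and the code assumes the URI contains no characters that need JSON escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Destination side: provide a new channel for the paused migration. */
static int qmp_migrate_recover(char *buf, size_t len, const char *uri)
{
    assert(buf && uri);
    return snprintf(buf, len,
                    "{\"execute\": \"migrate-recover\", "
                    "\"arguments\": {\"uri\": \"%s\"}}", uri);
}

/* Source side: resume the interrupted postcopy migration. */
static int qmp_migrate_resume(char *buf, size_t len, const char *uri)
{
    assert(buf && uri);
    return snprintf(buf, len,
                    "{\"execute\": \"migrate\", "
                    "\"arguments\": {\"uri\": \"%s\", \"resume\": true}}",
                    uri);
}
```

The order matters: the destination has to be listening on the new channel before the source is told to resume.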

diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index 6f65c23b47..c3e1400c0c 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -594,8 +594,7 @@ Postcopy
 'Postcopy' migration is a way to deal with migrations that refuse to converge
 (or take too long to converge) its plus side is that there is an upper bound on
 the amount of migration traffic and time it takes, the down side is that during
-the postcopy phase, a failure of *either* side or the network connection causes
-the guest to be lost.
+the postcopy phase, a failure of *either* side causes the guest to be lost.
 
 In postcopy the destination CPUs are started before all the memory has been
 transferred, and accesses to pages that are yet to be transferred cause
@@ -721,6 +720,42 @@ processing.
    is no longer used by migration, while the listen thread carries on servicing
    page data until the end of migration.
 
+Postcopy Recovery
+-----------------
+
+Comparing to precopy, postcopy is special on error handlings.  When any
+error happens (in this case, mostly network errors), QEMU cannot easily
+fail a migration because VM data resides in both source and destination
+QEMU instances.  On the other hand, when issue happens QEMU on both sides
+will go into a paused state.  It'll need a recovery phase to continue a
+paused postcopy migration.
+
+The recovery phase normally contains a few steps:
+
+  - When network issue occurs, both QEMU will go into PAUSED state
+
+  - When the network is recovered (or a new network is provided), the admin
+    can setup the new channel for migration using QMP command
+    'migrate-recover' on destination node, preparing for a resume.
+
+  - On source host, the admin can continue the interrupted postcopy
+    migration using QMP command 'migrate' with resume=true flag set.
+
+  - After the connection is re-established, QEMU will continue the postcopy
+    migration on both sides.
+
+During a paused postcopy migration, the VM can logically still continue
+running, and it will not be impacted from any page access to pages that
+were already migrated to destination VM before the interruption happens.
+However, if any of the missing pages got accessed on destination VM, the VM
+thread will be halted waiting for the page to be migrated, it means it can
+be halted until the recovery is complete.
+
+The impact of accessing missing pages can be relevant to different
+configurations of the guest.  For example, when with async page fault
+enabled, logically the guest can proactively schedule out the threads
+accessing missing pages.
+
 Postcopy states
 ---------------
 
@@ -765,36 +800,31 @@ ADVISE->DISCARD->LISTEN->RUNNING->END
     (although it can't do the cleanup it would do as it
     finishes a normal migration).
 
+ - Paused
+
+    Postcopy can run into a paused state (normally on both sides when
+    happens), where all threads will be temporarily halted mostly due to
+    network errors.  When reaching paused state, migration will make sure
+    the qemu binary on both sides maintain the data without corrupting
+    the VM.  To continue the migration, the admin needs to fix the
+    migration channel using the QMP command 'migrate-recover' on the
+    destination node, then resume the migration using QMP command 'migrate'
+    again on source node, with resume=true flag set.
+
  - End
 
     The listen thread can now quit, and perform the cleanup of migration
     state, the migration is now complete.
 
-Source side page maps
----------------------
-
-The source side keeps two bitmaps during postcopy; 'the migration bitmap'
-and 'unsent map'.  The 'migration bitmap' is basically the same as in
-the precopy case, and holds a bit to indicate that page is 'dirty' -
-i.e. needs sending.  During the precopy phase this is updated as the CPU
-dirties pages, however during postcopy the CPUs are stopped and nothing
-should dirty anything any more.
-
-The 'unsent map' is used for the transition to postcopy. It is a bitmap that
-has a bit cleared whenever a page is sent to the destination, however during
-the transition to postcopy mode it is combined with the migration bitmap
-to form a set of pages that:
-
-   a) Have been sent but then redirtied (which must be discarded)
-   b) Have not yet been sent - which also must be discarded to cause any
-      transparent huge pages built during precopy to be broken.
-
-Note that the contents of the unsentmap are sacrificed during the calculation
-of the discard set and thus aren't valid once in postcopy.  The dirtymap
-is still valid and is used to ensure that no page is sent more than once.  Any
-request for a page that has already been sent is ignored.  Duplicate requests
-such as this can happen as a page is sent at about the same time the
-destination accesses it.
+Source side page map
+--------------------
+
+The 'migration bitmap' in postcopy is basically the same as in the precopy,
+where each of the bit to indicate that page is 'dirty' - i.e. needs
+sending.  During the precopy phase this is updated as the CPU dirties
+pages, however during postcopy the CPUs are stopped and nothing should
+dirty anything any more. Instead, dirty bits are cleared when the relevant
+pages are sent during postcopy.
 
 Postcopy with hugepages
 -----------------------
@@ -853,6 +883,16 @@ Retro-fitting postcopy to existing clients is possible:
      guest memory access is made while holding a lock then all other
      threads waiting for that lock will also be blocked.
 
+Postcopy Preemption Mode
+------------------------
+
+Postcopy preempt is a new capability introduced in 8.0 QEMU release, it
+allows urgent pages (those got page fault requested from destination QEMU
+explicitly) to be sent in a separate preempt channel, rather than queued in
+the background migration channel.  Anyone who cares about latencies of page
+faults during a postcopy migration should enable this feature.  By default,
+it's not enabled.
+
 Firmware
 ========
 
-- 
2.40.1




* [PATCH 18/26] migration: Update error description whenever migration fails
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (16 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 17/26] docs/migration: Update postcopy bits Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 19/26] migration: enforce multifd and postcopy preempt to be set before incoming Juan Quintela
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Tejus GK

From: Tejus GK <tejus.gk@nutanix.com>

There are places in migration.c where the migration is marked failed with
MIGRATION_STATUS_FAILED, but the failure reason is never updated. Hence
libvirt doesn't know why the migration failed when it queries for it.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Tejus GK <tejus.gk@nutanix.com>
Message-ID: <20230621130940.178659-2-tejus.gk@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)
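
A self-contained sketch of the pattern this patch applies: the callee reports the failure through an errp-style out parameter instead of only printing it, and the caller stores the reason in the migration state before reporting it, so a later status query can return it. The `Error` and `MigState` types and all function names here are simplified stand-ins invented for the illustration. They are not QEMU's Error API or MigrationState.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for QEMU's Error and MigrationState. */
typedef struct Error { char msg[128]; } Error;
typedef struct MigState { int failed; char error_desc[128]; } MigState;

/* Allocate an error object carrying msg, if the caller asked for one. */
static void set_error(Error **errp, const char *msg)
{
    if (errp) {
        *errp = calloc(1, sizeof(**errp));
        if (*errp) {
            snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
        }
    }
}

/* Callee: hand the failure reason back instead of only printing it. */
static int start_phase(int stream_broken, Error **errp)
{
    if (stream_broken) {
        set_error(errp, "Migration stream errored");
        return -1;
    }
    return 0;
}

/* Caller: record the reason in the state, then report and free it. */
static void run_iteration(MigState *s, int stream_broken)
{
    Error *local_err = NULL;

    assert(s);
    if (start_phase(stream_broken, &local_err)) {
        s->failed = 1;
        if (local_err) {
            snprintf(s->error_desc, sizeof(s->error_desc), "%s",
                     local_err->msg);
            fprintf(stderr, "%s\n", local_err->msg);
            free(local_err);
        }
    }
}
```

The point of the ordering in run_iteration() is that the description is copied into the state before the error object is consumed by the reporting step.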

diff --git a/migration/migration.c b/migration/migration.c
index 1ea7512291..5528acb65e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1689,7 +1689,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
         if (!resume_requested) {
             yank_unregister_instance(MIGRATION_YANK_INSTANCE);
         }
-        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
+        error_setg(&local_err, QERR_INVALID_PARAMETER_VALUE, "uri",
                    "a valid migration protocol");
         migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
                           MIGRATION_STATUS_FAILED);
@@ -2082,7 +2082,7 @@ migration_wait_main_channel(MigrationState *ms)
  * Switch from normal iteration to postcopy
  * Returns non-0 on error
  */
-static int postcopy_start(MigrationState *ms)
+static int postcopy_start(MigrationState *ms, Error **errp)
 {
     int ret;
     QIOChannelBuffer *bioc;
@@ -2192,7 +2192,7 @@ static int postcopy_start(MigrationState *ms)
      */
     ret = qemu_file_get_error(ms->to_dst_file);
     if (ret) {
-        error_report("postcopy_start: Migration stream errored (pre package)");
+        error_setg(errp, "postcopy_start: Migration stream errored (pre package)");
         goto fail_closefb;
     }
 
@@ -2229,7 +2229,7 @@ static int postcopy_start(MigrationState *ms)
 
     ret = qemu_file_get_error(ms->to_dst_file);
     if (ret) {
-        error_report("postcopy_start: Migration stream errored");
+        error_setg(errp, "postcopy_start: Migration stream errored");
         migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
                               MIGRATION_STATUS_FAILED);
     }
@@ -2750,6 +2750,7 @@ typedef enum {
 static MigIterateState migration_iteration_run(MigrationState *s)
 {
     uint64_t must_precopy, can_postcopy;
+    Error *local_err = NULL;
     bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
     bool can_switchover = migration_can_switchover(s);
 
@@ -2773,8 +2774,9 @@ static MigIterateState migration_iteration_run(MigrationState *s)
     /* Still a significant amount to transfer */
     if (!in_postcopy && must_precopy <= s->threshold_size && can_switchover &&
         qatomic_read(&s->start_postcopy)) {
-        if (postcopy_start(s)) {
-            error_report("%s: postcopy failed to start", __func__);
+        if (postcopy_start(s, &local_err)) {
+            migrate_set_error(s, local_err);
+            error_report_err(local_err);
         }
         return MIG_ITERATE_SKIP;
     }
@@ -3265,8 +3267,10 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
      */
     if (migrate_postcopy_ram() || migrate_return_path()) {
         if (open_return_path_on_source(s, !resume)) {
-            error_report("Unable to open return-path for postcopy");
+            error_setg(&local_err, "Unable to open return-path for postcopy");
             migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+            migrate_set_error(s, local_err);
+            error_report_err(local_err);
             migrate_fd_cleanup(s);
             return;
         }
@@ -3290,6 +3294,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
     }
 
     if (multifd_save_setup(&local_err) != 0) {
+        migrate_set_error(s, local_err);
         error_report_err(local_err);
         migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
                           MIGRATION_STATUS_FAILED);
-- 
2.40.1




* [PATCH 19/26] migration: enforce multifd and postcopy preempt to be set before incoming
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (17 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 18/26] migration: Update error description whenever migration fails Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 20/26] qtest/migration-tests.c: use "-incoming defer" for postcopy tests Juan Quintela
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Wei Wang

From: Wei Wang <wei.w.wang@intel.com>

qemu_start_incoming_migration needs to check the number of multifd
channels or postcopy ram channels to configure the backlog parameter (i.e.
the maximum length to which the queue of pending connections for sockfd
may grow) of listen(). So enforce the usage of postcopy-preempt and
multifd as below:
- need to use "-incoming defer" on the destination; and
- set_capability and set_parameter need to be done before migrate_incoming

Otherwise, disable the use of the features and report error messages to
remind users to adjust the commands.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230606101910.20456-2-wei.w.wang@intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
---
 migration/options.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
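
A reduced sketch of the check added here, assuming a simplified model in which "incoming has started" means the incoming state already holds transport data. The `Incoming` type, the `CAP_*` enum and the function names are hypothetical stand-ins, not the QEMU definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { CAP_MULTIFD, CAP_POSTCOPY_PREEMPT, CAP_COUNT };

/* Hypothetical stand-in for the incoming-side migration state. */
typedef struct Incoming { const void *transport_data; } Incoming;

/* Incoming counts as started once a transport has been set up. */
static bool incoming_started(const Incoming *mis)
{
    return mis->transport_data != NULL;
}

/*
 * Return true if the requested capabilities may be applied.
 * On failure, *err points at a static string naming the reason.
 */
static bool caps_check(const Incoming *mis, const bool *new_caps,
                       const char **err)
{
    assert(mis && new_caps && err);
    *err = NULL;
    if (new_caps[CAP_POSTCOPY_PREEMPT] && incoming_started(mis)) {
        *err = "Postcopy preempt must be set before incoming starts";
        return false;
    }
    if (new_caps[CAP_MULTIFD] && incoming_started(mis)) {
        *err = "Multifd must be set before incoming starts";
        return false;
    }
    return true;
}
```

With "-incoming defer" the transport is only set up when migrate_incoming runs, so capabilities set before that point pass the check and later attempts are rejected.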

diff --git a/migration/options.c b/migration/options.c
index 7d83f190d6..1d1e1321b0 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -441,6 +441,11 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
     MIGRATION_CAPABILITY_VALIDATE_UUID,
     MIGRATION_CAPABILITY_ZERO_COPY_SEND);
 
+static bool migrate_incoming_started(void)
+{
+    return !!migration_incoming_get_current()->transport_data;
+}
+
 /**
  * @migration_caps_check - check capability compatibility
  *
@@ -564,6 +569,12 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
             error_setg(errp, "Postcopy preempt not compatible with compress");
             return false;
         }
+
+        if (migrate_incoming_started()) {
+            error_setg(errp,
+                       "Postcopy preempt must be set before incoming starts");
+            return false;
+        }
     }
 
     if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) {
@@ -571,6 +582,10 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
             error_setg(errp, "Multifd is not compatible with compress");
             return false;
         }
+        if (migrate_incoming_started()) {
+            error_setg(errp, "Multifd must be set before incoming starts");
+            return false;
+        }
     }
 
     if (new_caps[MIGRATION_CAPABILITY_SWITCHOVER_ACK]) {
-- 
2.40.1




* [PATCH 20/26] qtest/migration-tests.c: use "-incoming defer" for postcopy tests
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (18 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 19/26] migration: enforce multifd and postcopy preempt to be set before incoming Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 21/26] qemu-file: Rename qemu_file_transferred_ fast -> noflush Juan Quintela
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Wei Wang

From: Wei Wang <wei.w.wang@intel.com>

The Postcopy preempt capability is expected to be set before incoming
starts, so change the postcopy tests to start with deferred incoming and
call migrate-incoming after the cap has been set.

Why didn't the existing tests (without this patch) fail?
There could be two reasons:
1) "backlog" specifies the number of pending connections. As long as the
   server accepts connections faster than the client side makes them,
   the connections will succeed. The preempt test uses only 2 channels,
   so it is very unlikely to have pending connections.
2) per my tests (on kernel 6.2), the number of pending connections allowed
   is actually "backlog + 1", which is 2 in this case.
That said, the implementation of socket_start_incoming_migration_internal
expects "-incoming defer" to be used, so for safety, change the test to
work with the expected usage.
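
To make the backlog arithmetic concrete, here is a minimal sketch in C of
how the listen() backlog could be derived from the capabilities. It is an
assumption-laden model, not the code of
socket_start_incoming_migration_internal: model_listen_backlog() and
MODEL_RAM_CHANNELS_PREEMPT are hypothetical names, and the value 2 is taken
only from the "2 channels" mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the preempt setup uses 2 channels, as stated in the text. */
enum { MODEL_RAM_CHANNELS_PREEMPT = 2 };

/*
 * Hypothetical helper: choose a backlog from the capabilities that are
 * already set at the moment the listener is created.
 */
static int model_listen_backlog(bool multifd, int multifd_channels,
                                bool postcopy_preempt)
{
    if (multifd) {
        assert(multifd_channels > 0);
        return multifd_channels;
    }
    if (postcopy_preempt) {
        return MODEL_RAM_CHANNELS_PREEMPT;
    }
    return 1; /* only the main channel */
}
```

If the capability is set only after the listener exists, the model above
would have been called with postcopy_preempt == false and would return 1.
With the observed "backlog + 1" behaviour, two pending connections would
still fit, which matches reason 2.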

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230606101910.20456-3-wei.w.wang@intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 tests/qtest/migration-test.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index fd145e38d9..62d3f37021 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1239,10 +1239,9 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
                                     QTestState **to_ptr,
                                     MigrateCommon *args)
 {
-    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
     QTestState *from, *to;
 
-    if (test_migrate_start(&from, &to, uri, &args->start)) {
+    if (test_migrate_start(&from, &to, "defer", &args->start)) {
         return -1;
     }
 
@@ -1262,10 +1261,13 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
     migrate_ensure_non_converge(from);
 
     migrate_prepare_for_dirty_mem(from);
+    qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming',"
+                             "  'arguments': { 'uri': 'tcp:127.0.0.1:0' }}");
 
     /* Wait for the first serial output from the source */
     wait_for_serial("src_serial");
 
+    g_autofree char *uri = migrate_get_socket_address(to, "socket-address");
     migrate_qmp(from, uri, "{}");
 
     migrate_wait_for_dirty_mem(from, to);
-- 
2.40.1




* [PATCH 21/26] qemu-file: Rename qemu_file_transferred_ fast -> noflush
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (19 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 20/26] qtest/migration-tests.c: use "-incoming defer" for postcopy tests Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 22/26] migration: Change qemu_file_transferred to noflush Juan Quintela
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Philippe Mathieu-Daudé

Fast doesn't say much.  Noflush indicates more clearly that it is like
qemu_file_transferred but without the flush.
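
The difference between the two accessors can be pictured with a small
stand-alone model in C. This is a sketch only: struct file_model and the
model_* functions are hypothetical and much simpler than QEMUFile. The one
idea kept from the real function is that the noflush variant reports the
bytes already transferred plus the bytes still queued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MODEL_MAX_QUEUED = 4 };

/* Hypothetical, simplified stand-in for QEMUFile. */
struct file_model {
    uint64_t total_transferred;          /* bytes already sent */
    size_t queued_len[MODEL_MAX_QUEUED]; /* sizes of buffers awaiting a flush */
    int queued_cnt;                      /* number of valid queued_len entries */
};

/* Report sent plus queued bytes without touching the queue. */
static uint64_t model_transferred_noflush(const struct file_model *f)
{
    assert(f != NULL && f->queued_cnt <= MODEL_MAX_QUEUED);
    uint64_t ret = f->total_transferred;

    for (int i = 0; i < f->queued_cnt; i++) {
        ret += f->queued_len[i];
    }
    return ret;
}

/* Pretend to send everything that is queued, then empty the queue. */
static void model_flush(struct file_model *f)
{
    f->total_transferred = model_transferred_noflush(f);
    f->queued_cnt = 0;
}

/* Flush first, then report what has been sent. */
static uint64_t model_transferred(struct file_model *f)
{
    model_flush(f);
    return f->total_transferred;
}
```

Both functions return the same number in this model. The noflush variant
just leaves the queue alone, which is why it is the cheaper call when only
an offset is needed.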

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230530183941.7223-2-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/qemu-file.h | 11 +++++------
 migration/qemu-file.c |  2 +-
 migration/savevm.c    |  4 ++--
 migration/vmstate.c   |  4 ++--
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index e649718492..aa6eee66da 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -86,16 +86,15 @@ int qemu_fclose(QEMUFile *f);
 uint64_t qemu_file_transferred(QEMUFile *f);
 
 /*
- * qemu_file_transferred_fast:
+ * qemu_file_transferred_noflush:
  *
- * As qemu_file_transferred except for writable
- * files, where no flush is performed and the reported
- * amount will include the size of any queued buffers,
- * on top of the amount actually transferred.
+ * As qemu_file_transferred except for writable files, where no flush
+ * is performed and the reported amount will include the size of any
+ * queued buffers, on top of the amount actually transferred.
  *
  * Returns: the total bytes transferred and queued
  */
-uint64_t qemu_file_transferred_fast(QEMUFile *f);
+uint64_t qemu_file_transferred_noflush(QEMUFile *f);
 
 /*
  * put_buffer without copying the buffer.
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index acc282654a..fdf115b5da 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -694,7 +694,7 @@ int coroutine_mixed_fn qemu_get_byte(QEMUFile *f)
     return result;
 }
 
-uint64_t qemu_file_transferred_fast(QEMUFile *f)
+uint64_t qemu_file_transferred_noflush(QEMUFile *f)
 {
     uint64_t ret = f->total_transferred;
     int i;
diff --git a/migration/savevm.c b/migration/savevm.c
index 95c2abf47c..a07070db62 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -927,9 +927,9 @@ static int vmstate_load(QEMUFile *f, SaveStateEntry *se)
 static void vmstate_save_old_style(QEMUFile *f, SaveStateEntry *se,
                                    JSONWriter *vmdesc)
 {
-    uint64_t old_offset = qemu_file_transferred_fast(f);
+    uint64_t old_offset = qemu_file_transferred_noflush(f);
     se->ops->save_state(f, se->opaque);
-    uint64_t size = qemu_file_transferred_fast(f) - old_offset;
+    uint64_t size = qemu_file_transferred_noflush(f) - old_offset;
 
     if (vmdesc) {
         json_writer_int64(vmdesc, "size", size);
diff --git a/migration/vmstate.c b/migration/vmstate.c
index af01d54b6f..31842c3afb 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -361,7 +361,7 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
                 void *curr_elem = first_elem + size * i;
 
                 vmsd_desc_field_start(vmsd, vmdesc_loop, field, i, n_elems);
-                old_offset = qemu_file_transferred_fast(f);
+                old_offset = qemu_file_transferred_noflush(f);
                 if (field->flags & VMS_ARRAY_OF_POINTER) {
                     assert(curr_elem);
                     curr_elem = *(void **)curr_elem;
@@ -391,7 +391,7 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
                     return ret;
                 }
 
-                written_bytes = qemu_file_transferred_fast(f) - old_offset;
+                written_bytes = qemu_file_transferred_noflush(f) - old_offset;
                 vmsd_desc_field_end(vmsd, vmdesc_loop, field, written_bytes, i);
 
                 /* Compressed arrays only care about the first element */
-- 
2.40.1




* [PATCH 22/26] migration: Change qemu_file_transferred to noflush
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (20 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 21/26] qemu-file: Rename qemu_file_transferred_ fast -> noflush Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 23/26] qemu_file: Make qemu_file_is_writable() static Juan Quintela
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras,
	Philippe Mathieu-Daudé

qemu_fclose() is called right after that, and it also does a
qemu_fflush(), so remove one qemu_fflush().

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230530183941.7223-3-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/savevm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index a07070db62..cc59ddaa87 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3007,7 +3007,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
         goto the_end;
     }
     ret = qemu_savevm_state(f, errp);
-    vm_state_size = qemu_file_transferred(f);
+    vm_state_size = qemu_file_transferred_noflush(f);
     ret2 = qemu_fclose(f);
     if (ret < 0) {
         goto the_end;
-- 
2.40.1




* [PATCH 23/26] qemu_file: Make qemu_file_is_writable() static
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (21 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 22/26] migration: Change qemu_file_transferred to noflush Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 24/26] qemu-file: Simplify qemu_file_shutdown() Juan Quintela
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

It is not used outside of qemu_file, and it shouldn't be.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230530183941.7223-19-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/qemu-file.h | 1 -
 migration/qemu-file.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index aa6eee66da..a081ef6c3f 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -103,7 +103,6 @@ uint64_t qemu_file_transferred_noflush(QEMUFile *f);
 void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
                            bool may_free);
 bool qemu_file_mode_is_not_valid(const char *mode);
-bool qemu_file_is_writable(QEMUFile *f);
 
 #include "migration/qemu-file-types.h"
 
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index fdf115b5da..9a89e17924 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -228,7 +228,7 @@ void qemu_file_set_error(QEMUFile *f, int ret)
     qemu_file_set_error_obj(f, ret, NULL);
 }
 
-bool qemu_file_is_writable(QEMUFile *f)
+static bool qemu_file_is_writable(QEMUFile *f)
 {
     return f->is_writable;
 }
-- 
2.40.1




* [PATCH 24/26] qemu-file: Simplify qemu_file_shutdown()
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (22 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 23/26] qemu_file: Make qemu_file_is_writable() static Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 25/26] qemu-file: Make qemu_file_get_error_obj() static Juan Quintela
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230530183941.7223-20-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/qemu-file.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 9a89e17924..4c577bdff8 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -65,8 +65,6 @@ struct QEMUFile {
  */
 int qemu_file_shutdown(QEMUFile *f)
 {
-    int ret = 0;
-
     /*
      * We must set qemufile error before the real shutdown(), otherwise
      * there can be a race window where we thought IO all went though
@@ -96,10 +94,10 @@ int qemu_file_shutdown(QEMUFile *f)
     }
 
     if (qio_channel_shutdown(f->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL) < 0) {
-        ret = -EIO;
+        return -EIO;
     }
 
-    return ret;
+    return 0;
 }
 
 bool qemu_file_mode_is_not_valid(const char *mode)
-- 
2.40.1




* [PATCH 25/26] qemu-file: Make qemu_file_get_error_obj() static
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (23 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 24/26] qemu-file: Simplify qemu_file_shutdown() Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:06 ` [PATCH 26/26] migration/rdma: Split qemu_fopen_rdma() into input/output functions Juan Quintela
  2023-07-24 13:28 ` [PATCH 00/26] Migration PULL 2023-07-24 Thomas Huth
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

It was not used outside of qemu_file.c anyways.

Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230530183941.7223-21-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/qemu-file.h | 1 -
 migration/qemu-file.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index a081ef6c3f..8b8b7d27fe 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -128,7 +128,6 @@ void qemu_file_skip(QEMUFile *f, int size);
  * accounting information tracks the total migration traffic.
  */
 void qemu_file_credit_transfer(QEMUFile *f, size_t size);
-int qemu_file_get_error_obj(QEMUFile *f, Error **errp);
 int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp);
 void qemu_file_set_error_obj(QEMUFile *f, int ret, Error *err);
 void qemu_file_set_error(QEMUFile *f, int ret);
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 4c577bdff8..d30bf3c377 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -158,7 +158,7 @@ void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
  * is not 0.
  *
  */
-int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
+static int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
 {
     if (errp) {
         *errp = f->last_error_obj ? error_copy(f->last_error_obj) : NULL;
-- 
2.40.1




* [PATCH 26/26] migration/rdma: Split qemu_fopen_rdma() into input/output functions
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (24 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 25/26] qemu-file: Make qemu_file_get_error_obj() static Juan Quintela
@ 2023-07-24 13:06 ` Juan Quintela
  2023-07-24 13:28 ` [PATCH 00/26] Migration PULL 2023-07-24 Thomas Huth
  26 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-24 13:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

This is how everything else in QEMUFile is structured.
As a bonus, there are three fewer lines of code.

Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230530183941.7223-17-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/qemu-file.h |  1 -
 migration/qemu-file.c | 12 ------------
 migration/rdma.c      | 39 +++++++++++++++++++--------------------
 3 files changed, 19 insertions(+), 33 deletions(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 8b8b7d27fe..47015f5201 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -102,7 +102,6 @@ uint64_t qemu_file_transferred_noflush(QEMUFile *f);
  */
 void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
                            bool may_free);
-bool qemu_file_mode_is_not_valid(const char *mode);
 
 #include "migration/qemu-file-types.h"
 
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index d30bf3c377..19c33c9985 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -100,18 +100,6 @@ int qemu_file_shutdown(QEMUFile *f)
     return 0;
 }
 
-bool qemu_file_mode_is_not_valid(const char *mode)
-{
-    if (mode == NULL ||
-        (mode[0] != 'r' && mode[0] != 'w') ||
-        mode[1] != 'b' || mode[2] != 0) {
-        fprintf(stderr, "qemu_fopen: Argument validity check failed\n");
-        return true;
-    }
-
-    return false;
-}
-
 static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
 {
     QEMUFile *f;
diff --git a/migration/rdma.c b/migration/rdma.c
index dd1c039e6c..ca430d319d 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -4053,27 +4053,26 @@ static void qio_channel_rdma_register_types(void)
 
 type_init(qio_channel_rdma_register_types);
 
-static QEMUFile *qemu_fopen_rdma(RDMAContext *rdma, const char *mode)
+static QEMUFile *rdma_new_input(RDMAContext *rdma)
 {
-    QIOChannelRDMA *rioc;
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA));
 
-    if (qemu_file_mode_is_not_valid(mode)) {
-        return NULL;
-    }
+    rioc->file = qemu_file_new_input(QIO_CHANNEL(rioc));
+    rioc->rdmain = rdma;
+    rioc->rdmaout = rdma->return_path;
+    qemu_file_set_hooks(rioc->file, &rdma_read_hooks);
 
-    rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA));
+    return rioc->file;
+}
 
-    if (mode[0] == 'w') {
-        rioc->file = qemu_file_new_output(QIO_CHANNEL(rioc));
-        rioc->rdmaout = rdma;
-        rioc->rdmain = rdma->return_path;
-        qemu_file_set_hooks(rioc->file, &rdma_write_hooks);
-    } else {
-        rioc->file = qemu_file_new_input(QIO_CHANNEL(rioc));
-        rioc->rdmain = rdma;
-        rioc->rdmaout = rdma->return_path;
-        qemu_file_set_hooks(rioc->file, &rdma_read_hooks);
-    }
+static QEMUFile *rdma_new_output(RDMAContext *rdma)
+{
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA));
+
+    rioc->file = qemu_file_new_output(QIO_CHANNEL(rioc));
+    rioc->rdmaout = rdma;
+    rioc->rdmain = rdma->return_path;
+    qemu_file_set_hooks(rioc->file, &rdma_write_hooks);
 
     return rioc->file;
 }
@@ -4099,9 +4098,9 @@ static void rdma_accept_incoming_migration(void *opaque)
         return;
     }
 
-    f = qemu_fopen_rdma(rdma, "rb");
+    f = rdma_new_input(rdma);
     if (f == NULL) {
-        fprintf(stderr, "RDMA ERROR: could not qemu_fopen_rdma\n");
+        fprintf(stderr, "RDMA ERROR: could not open RDMA for input\n");
         qemu_rdma_cleanup(rdma);
         return;
     }
@@ -4224,7 +4223,7 @@ void rdma_start_outgoing_migration(void *opaque,
 
     trace_rdma_start_outgoing_migration_after_rdma_connect();
 
-    s->to_dst_file = qemu_fopen_rdma(rdma, "wb");
+    s->to_dst_file = rdma_new_output(rdma);
     migrate_fd_connect(s, NULL);
     return;
 return_path_err:
-- 
2.40.1




* Re: [PATCH 00/26] Migration PULL 2023-07-24
  2023-07-24 13:06 [PATCH 00/26] Migration PULL 2023-07-24 Juan Quintela
                   ` (25 preceding siblings ...)
  2023-07-24 13:06 ` [PATCH 26/26] migration/rdma: Split qemu_fopen_rdma() into input/output functions Juan Quintela
@ 2023-07-24 13:28 ` Thomas Huth
  2023-07-31  6:56   ` Juan Quintela
  26 siblings, 1 reply; 30+ messages in thread
From: Thomas Huth @ 2023-07-24 13:28 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel
  Cc: Laurent Vivier, Markus Armbruster, libvir-list, Paolo Bonzini,
	Peter Xu, Eric Blake, Leonardo Bras

On 24/07/2023 15.06, Juan Quintela wrote:
> Hi
> 
> This is the migration PULL request.

Maybe it would be better to use "PULL" instead of "PATCH" in the subject?

> Now a not on CI, thas has been really bad.  After too many problems
> with last PULLS, I decided to learn to use qemu CI.  On one hand, it
> is not so difficult, even I can use it O:-)
> 
> On the other hand, the amount of problems that I got is inmense.  Some
> of them dissapear when I rerun the checks, but I never know if it is
> my PULL request, the CI system or the tests themselves.

I normally peek at https://gitlab.com/qemu-project/qemu/-/pipelines to see 
whether the problem occurred in one of the last staging CI runs already ... 
or I push the master branch to my own repo to see whether it reproduces with 
a clean state. That often helps in judging whether it's a new problem or a 
pre-existing one.

> This (last) patch is not part of the PULL request, but I have found
> that it _always_ makes gcov fail.  I had to use bisect to find where
> the problem was.
> 
> https://gitlab.com/juan.quintela/qemu/-/jobs/4571878922
> 
> I could use help to know how a change in test/qtest/migration-test.c
> can break block layer tests, I am all ears.
> 
> Yes, I tried several times.  It always fails on that patch.  The
> passes with flying colors.

Can you reproduce it locally by running "make check-block"?

The tests/qemu-iotests/tests/copy-before-write test seems to be doing some 
things with snapshots ... maybe that's related?

  Thomas




* Re: [PATCH 02/26] migration/multifd: Protect accesses to migration_threads
  2023-07-24 13:06 ` [PATCH 02/26] migration/multifd: Protect accesses to migration_threads Juan Quintela
@ 2023-07-24 13:29   ` Fabiano Rosas
  0 siblings, 0 replies; 30+ messages in thread
From: Fabiano Rosas @ 2023-07-24 13:29 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Juan Quintela, Leonardo Bras

Juan Quintela <quintela@redhat.com> writes:

> From: Fabiano Rosas <farosas@suse.de>
>
> This doubly linked list is common for all the multifd and migration
> threads so we need to avoid concurrent access.
>
> Add a mutex to protect the data from concurrent access. This fixes a
> crash when removing two MigrationThread objects from the list at the
> same time during cleanup of multifd threads.
>
> Fixes: 671326201d ("migration: Introduce interface query-migrationthreads")
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> Message-Id: <20230607161306.31425-3-farosas@suse.de>
> Signed-off-by: Juan Quintela <quintela@redhat.com>

Hi Juan,

What about re-enabling the /multifd/tcp/plain/cancel test? You had
mentioned that something else was needed, but never said exactly
what...

I've been doing a lot of migration work recently and all of my branches
have this change and the cancel test enabled. No issues so far.




* Re: [PATCH 00/26] Migration PULL 2023-07-24
  2023-07-24 13:28 ` [PATCH 00/26] Migration PULL 2023-07-24 Thomas Huth
@ 2023-07-31  6:56   ` Juan Quintela
  0 siblings, 0 replies; 30+ messages in thread
From: Juan Quintela @ 2023-07-31  6:56 UTC (permalink / raw)
  To: Thomas Huth
  Cc: qemu-devel, Laurent Vivier, Markus Armbruster, libvir-list,
	Paolo Bonzini, Peter Xu, Eric Blake, Leonardo Bras

Thomas Huth <thuth@redhat.com> wrote:
> On 24/07/2023 15.06, Juan Quintela wrote:
>> Hi
>> This is the migration PULL request.
>
> Maybe it would better to use "PULL" instead of "PATCH" in the subject?

Grrrr.

Resending.  Thanks.

>> Now a not on CI, thas has been really bad.  After too many problems
>> with last PULLS, I decided to learn to use qemu CI.  On one hand, it
>> is not so difficult, even I can use it O:-)
>> On the other hand, the amount of problems that I got is inmense.
>> Some
>> of them dissapear when I rerun the checks, but I never know if it is
>> my PULL request, the CI system or the tests themselves.
>
> I normally peek at https://gitlab.com/qemu-project/qemu/-/pipelines to
> see whether the problem occurred in one of the last staging CI runs
> already ... or I push the master branch to my own repo to see whether
> it reproduces with a clean state. That often helps in judging whether
> it's a new problem or a pre-existing one.

It doesn't happen on the master branch at the time.  It only happens with
my changes.

But the change previous to that one runs well.  That one always fails in
the block layer.  And the changes on that "series" were only for
migration-test.c, so it shouldn't break any other tests.  No other files
are touched.

Yes, in the PULL request more files are touched, but the tests I was
doing on CI there weren't.

I really have no clue what gcov is adding to those tests (I know what
gcov is, but not what gcov is trying to do in that test).

>> This (last) patch is not part of the PULL request, but I have found
>> that it _always_ makes gcov fail.  I had to use bisect to find where
>> the problem was.
>> https://gitlab.com/juan.quintela/qemu/-/jobs/4571878922
>> I could use help to know how a change in test/qtest/migration-test.c
>> can break block layer tests, I am all ears.
>> Yes, I tried several times.  It always fails on that patch.  The
>> passes with flying colors.
>
> Can you reproduce it locally by running "make check-block"?

No.  make check with all architectures under the sun works as expected.

I have learned my lesson here, and now I have two terminals open.  One
compiles x86_64 natively and tests natively.

The other compiles aarch64 and tests it using TCG.

(I do more tests, but those are run after each patch gets reviewed and
integrated for the PULL request)

> The tests/qemu-iotests/tests/copy-before-write test seems to be doing
> some things with snapshots ... maybe that's related?

It could be.  But I am not changing that.  I am only changing
migration-test.c.

As Daniel answered on list, it is probably just a race that a change in
timing makes more likely.

Later, juan.



