* [PULL 01/25] docs/devel/migration.rst: Document the file transport
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:12 ` [PULL 02/25] tests/qtest/migration: Rename fd_proto test peterx
` (24 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Fabiano Rosas <farosas@suse.de>
When adding the support for file migration with the file: transport,
we missed adding documentation for it.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240220224138.24759-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
docs/devel/migration/main.rst | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 331252a92c..8024275d6d 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -41,6 +41,10 @@ over any transport.
- exec migration: do the migration using the stdin/stdout through a process.
- fd migration: do the migration using a file descriptor that is
passed to QEMU. QEMU doesn't care how this file descriptor is opened.
+- file migration: do the migration using a file that is passed to QEMU
+ by path. A file offset option is supported to allow a management
+ application to add its own metadata to the start of the file without
+ QEMU interference.
In addition, support is included for migration using RDMA, which
transports the page data using ``RDMA``, where the hardware takes care of
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 02/25] tests/qtest/migration: Rename fd_proto test
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
2024-02-28 5:12 ` [PULL 01/25] docs/devel/migration.rst: Document the file transport peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:12 ` [PULL 03/25] tests/qtest/migration: Add a fd + file test peterx
` (23 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Fabiano Rosas <farosas@suse.de>
Next patch adds another fd test. Rename the existing one closer to
what's used on other tests, with the 'precopy' prefix.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240220224138.24759-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
tests/qtest/migration-test.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 8a5bb1752e..b729ce4d22 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2423,7 +2423,7 @@ static void test_migrate_fd_finish_hook(QTestState *from,
qobject_unref(rsp);
}
-static void test_migrate_fd_proto(void)
+static void test_migrate_precopy_fd_socket(void)
{
MigrateCommon args = {
.listen_uri = "defer",
@@ -3527,7 +3527,8 @@ int main(int argc, char **argv)
/* migration_test_add("/migration/ignore_shared", test_ignore_shared); */
#ifndef _WIN32
- migration_test_add("/migration/fd_proto", test_migrate_fd_proto);
+ migration_test_add("/migration/precopy/fd/tcp",
+ test_migrate_precopy_fd_socket);
#endif
migration_test_add("/migration/validate_uuid", test_validate_uuid);
migration_test_add("/migration/validate_uuid_error",
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 03/25] tests/qtest/migration: Add a fd + file test
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
2024-02-28 5:12 ` [PULL 01/25] docs/devel/migration.rst: Document the file transport peterx
2024-02-28 5:12 ` [PULL 02/25] tests/qtest/migration: Rename fd_proto test peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:12 ` [PULL 04/25] migration/multifd: Remove p->quit from recv side peterx
` (22 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Fabiano Rosas <farosas@suse.de>
The fd URI supports an fd that is backed by a file. The code should
select between QIOChannelFile and QIOChannelSocket, depending on the
type of the fd. Add a test for that.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240220224138.24759-4-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
tests/qtest/migration-test.c | 41 ++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index b729ce4d22..83512bce85 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2433,6 +2433,45 @@ static void test_migrate_precopy_fd_socket(void)
};
test_precopy_common(&args);
}
+
+static void *migrate_precopy_fd_file_start(QTestState *from, QTestState *to)
+{
+ g_autofree char *file = g_strdup_printf("%s/%s", tmpfs, FILE_TEST_FILENAME);
+ int src_flags = O_CREAT | O_RDWR;
+ int dst_flags = O_CREAT | O_RDWR;
+ int fds[2];
+
+ fds[0] = open(file, src_flags, 0660);
+ assert(fds[0] != -1);
+
+ fds[1] = open(file, dst_flags, 0660);
+ assert(fds[1] != -1);
+
+
+ qtest_qmp_fds_assert_success(to, &fds[0], 1,
+ "{ 'execute': 'getfd',"
+ " 'arguments': { 'fdname': 'fd-mig' }}");
+
+ qtest_qmp_fds_assert_success(from, &fds[1], 1,
+ "{ 'execute': 'getfd',"
+ " 'arguments': { 'fdname': 'fd-mig' }}");
+
+ close(fds[0]);
+ close(fds[1]);
+
+ return NULL;
+}
+
+static void test_migrate_precopy_fd_file(void)
+{
+ MigrateCommon args = {
+ .listen_uri = "defer",
+ .connect_uri = "fd:fd-mig",
+ .start_hook = migrate_precopy_fd_file_start,
+ .finish_hook = test_migrate_fd_finish_hook
+ };
+ test_file_common(&args, true);
+}
#endif /* _WIN32 */
static void do_test_validate_uuid(MigrateStart *args, bool should_fail)
@@ -3529,6 +3568,8 @@ int main(int argc, char **argv)
#ifndef _WIN32
migration_test_add("/migration/precopy/fd/tcp",
test_migrate_precopy_fd_socket);
+ migration_test_add("/migration/precopy/fd/file",
+ test_migrate_precopy_fd_file);
#endif
migration_test_add("/migration/validate_uuid", test_validate_uuid);
migration_test_add("/migration/validate_uuid_error",
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 04/25] migration/multifd: Remove p->quit from recv side
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (2 preceding siblings ...)
2024-02-28 5:12 ` [PULL 03/25] tests/qtest/migration: Add a fd + file test peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:12 ` [PULL 05/25] migration/multifd: Release recv sem_sync earlier peterx
` (21 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Fabiano Rosas <farosas@suse.de>
Like we did on the sending side, replace the p->quit per-channel flag
with a global atomic 'exiting' flag.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240220224138.24759-5-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd.c | 41 ++++++++++++++++++++++++-----------------
1 file changed, 24 insertions(+), 17 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index adfe8c9a0a..fba00b9e8f 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -79,6 +79,19 @@ struct {
MultiFDMethods *ops;
} *multifd_send_state;
+struct {
+ MultiFDRecvParams *params;
+ /* number of created threads */
+ int count;
+ /* syncs main thread and channels */
+ QemuSemaphore sem_sync;
+ /* global number of generated multifd packets */
+ uint64_t packet_num;
+ int exiting;
+ /* multifd ops */
+ MultiFDMethods *ops;
+} *multifd_recv_state;
+
/* Multifd without compression */
/**
@@ -440,6 +453,11 @@ static bool multifd_send_should_exit(void)
return qatomic_read(&multifd_send_state->exiting);
}
+static bool multifd_recv_should_exit(void)
+{
+ return qatomic_read(&multifd_recv_state->exiting);
+}
+
/*
* The migration thread can wait on either of the two semaphores. This
* function can be used to kick the main thread out of waiting on either of
@@ -1063,24 +1081,16 @@ bool multifd_send_setup(void)
return true;
}
-struct {
- MultiFDRecvParams *params;
- /* number of created threads */
- int count;
- /* syncs main thread and channels */
- QemuSemaphore sem_sync;
- /* global number of generated multifd packets */
- uint64_t packet_num;
- /* multifd ops */
- MultiFDMethods *ops;
-} *multifd_recv_state;
-
static void multifd_recv_terminate_threads(Error *err)
{
int i;
trace_multifd_recv_terminate_threads(err != NULL);
+ if (qatomic_xchg(&multifd_recv_state->exiting, 1)) {
+ return;
+ }
+
if (err) {
MigrationState *s = migrate_get_current();
migrate_set_error(s, err);
@@ -1094,8 +1104,6 @@ static void multifd_recv_terminate_threads(Error *err)
for (i = 0; i < migrate_multifd_channels(); i++) {
MultiFDRecvParams *p = &multifd_recv_state->params[i];
- qemu_mutex_lock(&p->mutex);
- p->quit = true;
/*
* We could arrive here for two reasons:
* - normal quit, i.e. everything went fine, just finished
@@ -1105,7 +1113,6 @@ static void multifd_recv_terminate_threads(Error *err)
if (p->c) {
qio_channel_shutdown(p->c, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
}
- qemu_mutex_unlock(&p->mutex);
}
}
@@ -1210,7 +1217,7 @@ static void *multifd_recv_thread(void *opaque)
while (true) {
uint32_t flags;
- if (p->quit) {
+ if (multifd_recv_should_exit()) {
break;
}
@@ -1274,6 +1281,7 @@ int multifd_recv_setup(Error **errp)
multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
qatomic_set(&multifd_recv_state->count, 0);
+ qatomic_set(&multifd_recv_state->exiting, 0);
qemu_sem_init(&multifd_recv_state->sem_sync, 0);
multifd_recv_state->ops = multifd_ops[migrate_multifd_compression()];
@@ -1282,7 +1290,6 @@ int multifd_recv_setup(Error **errp)
qemu_mutex_init(&p->mutex);
qemu_sem_init(&p->sem_sync, 0);
- p->quit = false;
p->id = i;
p->packet_len = sizeof(MultiFDPacket_t)
+ sizeof(uint64_t) * page_count;
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 05/25] migration/multifd: Release recv sem_sync earlier
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (3 preceding siblings ...)
2024-02-28 5:12 ` [PULL 04/25] migration/multifd: Remove p->quit from recv side peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:12 ` [PULL 06/25] migration/multifd: Cleanup TLS iochannel referencing peterx
` (20 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Fabiano Rosas <farosas@suse.de>
Now that multifd_recv_terminate_threads() is called only once, release
the recv side sem_sync earlier like we do for the send side.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240220224138.24759-6-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index fba00b9e8f..43f0820996 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1104,6 +1104,12 @@ static void multifd_recv_terminate_threads(Error *err)
for (i = 0; i < migrate_multifd_channels(); i++) {
MultiFDRecvParams *p = &multifd_recv_state->params[i];
+ /*
+ * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code,
+ * however try to wakeup it without harm in cleanup phase.
+ */
+ qemu_sem_post(&p->sem_sync);
+
/*
* We could arrive here for two reasons:
* - normal quit, i.e. everything went fine, just finished
@@ -1162,12 +1168,6 @@ void multifd_recv_cleanup(void)
for (i = 0; i < migrate_multifd_channels(); i++) {
MultiFDRecvParams *p = &multifd_recv_state->params[i];
- /*
- * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code,
- * however try to wakeup it without harm in cleanup phase.
- */
- qemu_sem_post(&p->sem_sync);
-
if (p->thread_created) {
qemu_thread_join(&p->thread);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 06/25] migration/multifd: Cleanup TLS iochannel referencing
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (4 preceding siblings ...)
2024-02-28 5:12 ` [PULL 05/25] migration/multifd: Release recv sem_sync earlier peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:12 ` [PULL 07/25] migration/multifd: Drop registered_yank peterx
` (19 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Peter Xu <peterx@redhat.com>
Commit a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to
blocking handshake") introduced a thread for TLS channels, which will
resolve the issue on blocking the main thread. However in the same commit
p->c is slightly abused just to be able to pass over the pointer "p" into
the thread.
That's the major reason we'll need to conditionally free the io channel in
the fault paths.
To clean it up, using a separate structure to pass over both "p" and "tioc"
in the tls handshake thread. Then we can make it a rule that p->c will
never be set until the channel is completely setup. With that, we can drop
the tricky conditional unref of the io channel in the error path.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd.c | 37 +++++++++++++++++++++++--------------
1 file changed, 23 insertions(+), 14 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 43f0820996..84a6b9e58f 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -891,16 +891,22 @@ out:
static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque);
+typedef struct {
+ MultiFDSendParams *p;
+ QIOChannelTLS *tioc;
+} MultiFDTLSThreadArgs;
+
static void *multifd_tls_handshake_thread(void *opaque)
{
- MultiFDSendParams *p = opaque;
- QIOChannelTLS *tioc = QIO_CHANNEL_TLS(p->c);
+ MultiFDTLSThreadArgs *args = opaque;
- qio_channel_tls_handshake(tioc,
+ qio_channel_tls_handshake(args->tioc,
multifd_new_send_channel_async,
- p,
+ args->p,
NULL,
NULL);
+ g_free(args);
+
return NULL;
}
@@ -910,6 +916,7 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
{
MigrationState *s = migrate_get_current();
const char *hostname = s->hostname;
+ MultiFDTLSThreadArgs *args;
QIOChannelTLS *tioc;
tioc = migration_tls_client_create(ioc, hostname, errp);
@@ -924,11 +931,14 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
object_unref(OBJECT(ioc));
trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
- p->c = QIO_CHANNEL(tioc);
+
+ args = g_new0(MultiFDTLSThreadArgs, 1);
+ args->tioc = tioc;
+ args->p = p;
p->tls_thread_created = true;
qemu_thread_create(&p->tls_thread, "multifd-tls-handshake-worker",
- multifd_tls_handshake_thread, p,
+ multifd_tls_handshake_thread, args,
QEMU_THREAD_JOINABLE);
return true;
}
@@ -941,6 +951,7 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
migration_ioc_register_yank(ioc);
p->registered_yank = true;
+ /* Setup p->c only if the channel is completely setup */
p->c = ioc;
p->thread_created = true;
@@ -994,14 +1005,12 @@ out:
trace_multifd_new_send_channel_async_error(p->id, local_err);
multifd_send_set_error(local_err);
- if (!p->c) {
- /*
- * If no channel has been created, drop the initial
- * reference. Otherwise cleanup happens at
- * multifd_send_channel_destroy()
- */
- object_unref(OBJECT(ioc));
- }
+ /*
+ * For error cases (TLS or non-TLS), IO channel is always freed here
+ * rather than when cleanup multifd: since p->c is not set, multifd
+ * cleanup code doesn't even know its existence.
+ */
+ object_unref(OBJECT(ioc));
error_free(local_err);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 07/25] migration/multifd: Drop registered_yank
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (5 preceding siblings ...)
2024-02-28 5:12 ` [PULL 06/25] migration/multifd: Cleanup TLS iochannel referencing peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:12 ` [PULL 08/25] migration/multifd: Make multifd_channel_connect() return void peterx
` (18 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Peter Xu <peterx@redhat.com>
With a clear definition of p->c protocol, where we only set it up if the
channel is fully established (TLS or non-TLS), registered_yank boolean will
have equal meaning of "p->c != NULL".
Drop registered_yank by checking p->c instead.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd.h | 2 --
migration/multifd.c | 7 +++----
2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/migration/multifd.h b/migration/multifd.h
index 8a1cad0996..b3fe27ae93 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -78,8 +78,6 @@ typedef struct {
bool tls_thread_created;
/* communication channel */
QIOChannel *c;
- /* is the yank function registered */
- bool registered_yank;
/* packet allocated len */
uint32_t packet_len;
/* guest page size */
diff --git a/migration/multifd.c b/migration/multifd.c
index 84a6b9e58f..1d039a4840 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -666,11 +666,11 @@ static int multifd_send_channel_destroy(QIOChannel *send)
static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
{
- if (p->registered_yank) {
+ if (p->c) {
migration_ioc_unregister_yank(p->c);
+ multifd_send_channel_destroy(p->c);
+ p->c = NULL;
}
- multifd_send_channel_destroy(p->c);
- p->c = NULL;
qemu_sem_destroy(&p->sem);
qemu_sem_destroy(&p->sem_sync);
g_free(p->name);
@@ -950,7 +950,6 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
qio_channel_set_delay(ioc, false);
migration_ioc_register_yank(ioc);
- p->registered_yank = true;
/* Setup p->c only if the channel is completely setup */
p->c = ioc;
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 08/25] migration/multifd: Make multifd_channel_connect() return void
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (6 preceding siblings ...)
2024-02-28 5:12 ` [PULL 07/25] migration/multifd: Drop registered_yank peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:12 ` [PULL 09/25] migration/multifd: Cleanup outgoing_args in state destroy peterx
` (17 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Avihai Horon
From: Peter Xu <peterx@redhat.com>
It never fails, drop the retval and also the Error**.
Suggested-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 1d039a4840..af89e05915 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -943,9 +943,7 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
return true;
}
-static bool multifd_channel_connect(MultiFDSendParams *p,
- QIOChannel *ioc,
- Error **errp)
+static void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc)
{
qio_channel_set_delay(ioc, false);
@@ -956,7 +954,6 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
p->thread_created = true;
qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
QEMU_THREAD_JOINABLE);
- return true;
}
/*
@@ -988,7 +985,8 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
return;
}
} else {
- ret = multifd_channel_connect(p, ioc, &local_err);
+ multifd_channel_connect(p, ioc);
+ ret = true;
}
out:
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 09/25] migration/multifd: Cleanup outgoing_args in state destroy
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (7 preceding siblings ...)
2024-02-28 5:12 ` [PULL 08/25] migration/multifd: Make multifd_channel_connect() return void peterx
@ 2024-02-28 5:12 ` peterx
2024-02-28 5:13 ` [PULL 10/25] migration/multifd: Drop unnecessary helper to destroy IOC peterx
` (16 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:12 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Peter Xu <peterx@redhat.com>
outgoing_args is a global cache of socket address to be reused in multifd.
Freeing the cache in per-channel destructor is more or less a hack. Move
it to multifd_send_cleanup_state() so it only get checked once. Use a
small helper to do so because it's internal of socket.c.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/socket.h | 2 ++
migration/multifd.c | 1 +
migration/socket.c | 12 ++++++++----
3 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/migration/socket.h b/migration/socket.h
index 5e4c33b8ea..5f52eddd4c 100644
--- a/migration/socket.h
+++ b/migration/socket.h
@@ -29,4 +29,6 @@ void socket_start_incoming_migration(SocketAddress *saddr, Error **errp);
void socket_start_outgoing_migration(MigrationState *s,
SocketAddress *saddr, Error **errp);
+void socket_cleanup_outgoing_migration(void);
+
#endif
diff --git a/migration/multifd.c b/migration/multifd.c
index af89e05915..fa33fd98b4 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -689,6 +689,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
static void multifd_send_cleanup_state(void)
{
+ socket_cleanup_outgoing_migration();
qemu_sem_destroy(&multifd_send_state->channels_created);
qemu_sem_destroy(&multifd_send_state->channels_ready);
g_free(multifd_send_state->params);
diff --git a/migration/socket.c b/migration/socket.c
index 98e3ea1514..3184c7c3c1 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -64,10 +64,6 @@ int socket_send_channel_destroy(QIOChannel *send)
{
/* Remove channel */
object_unref(OBJECT(send));
- if (outgoing_args.saddr) {
- qapi_free_SocketAddress(outgoing_args.saddr);
- outgoing_args.saddr = NULL;
- }
return 0;
}
@@ -137,6 +133,14 @@ void socket_start_outgoing_migration(MigrationState *s,
NULL);
}
+void socket_cleanup_outgoing_migration(void)
+{
+ if (outgoing_args.saddr) {
+ qapi_free_SocketAddress(outgoing_args.saddr);
+ outgoing_args.saddr = NULL;
+ }
+}
+
static void socket_accept_incoming_migration(QIONetListener *listener,
QIOChannelSocket *cioc,
gpointer opaque)
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 10/25] migration/multifd: Drop unnecessary helper to destroy IOC
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (8 preceding siblings ...)
2024-02-28 5:12 ` [PULL 09/25] migration/multifd: Cleanup outgoing_args in state destroy peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 11/25] notify: pass error to notifier with return peterx
` (15 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Peter Xu <peterx@redhat.com>
Both socket_send_channel_destroy() and multifd_send_channel_destroy() are
unnecessary wrappers to destroy an IOC, as the only thing to do is to
release the final IOC reference. We have plenty of code that destroys an
IOC using direct unref() already; keep that style.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240222095301.171137-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/socket.h | 1 -
migration/multifd.c | 7 +------
migration/socket.c | 7 -------
3 files changed, 1 insertion(+), 14 deletions(-)
diff --git a/migration/socket.h b/migration/socket.h
index 5f52eddd4c..46c233ecd2 100644
--- a/migration/socket.h
+++ b/migration/socket.h
@@ -23,7 +23,6 @@
void socket_send_channel_create(QIOTaskFunc f, void *data);
QIOChannel *socket_send_channel_create_sync(Error **errp);
-int socket_send_channel_destroy(QIOChannel *send);
void socket_start_incoming_migration(SocketAddress *saddr, Error **errp);
diff --git a/migration/multifd.c b/migration/multifd.c
index fa33fd98b4..6c07f19af1 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -659,16 +659,11 @@ static void multifd_send_terminate_threads(void)
}
}
-static int multifd_send_channel_destroy(QIOChannel *send)
-{
- return socket_send_channel_destroy(send);
-}
-
static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
{
if (p->c) {
migration_ioc_unregister_yank(p->c);
- multifd_send_channel_destroy(p->c);
+ object_unref(OBJECT(p->c));
p->c = NULL;
}
qemu_sem_destroy(&p->sem);
diff --git a/migration/socket.c b/migration/socket.c
index 3184c7c3c1..9ab89b1e08 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -60,13 +60,6 @@ QIOChannel *socket_send_channel_create_sync(Error **errp)
return QIO_CHANNEL(sioc);
}
-int socket_send_channel_destroy(QIOChannel *send)
-{
- /* Remove channel */
- object_unref(OBJECT(send));
- return 0;
-}
-
struct SocketConnectData {
MigrationState *s;
char *hostname;
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 11/25] notify: pass error to notifier with return
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (9 preceding siblings ...)
2024-02-28 5:13 ` [PULL 10/25] migration/multifd: Drop unnecessary helper to destroy IOC peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 12/25] migration: remove error from notifier data peterx
` (14 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Pass an error object as the third parameter to "notifier with return"
notifiers, so clients no longer need to bundle an error object in the
opaque data. The new parameter is used in a later patch.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/qemu/notify.h | 7 +++++--
hw/virtio/vhost-user.c | 2 +-
hw/virtio/virtio-balloon.c | 3 ++-
migration/postcopy-ram.c | 2 +-
migration/ram.c | 2 +-
util/notify.c | 5 +++--
6 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/include/qemu/notify.h b/include/qemu/notify.h
index bcfa70fb2e..9a85631864 100644
--- a/include/qemu/notify.h
+++ b/include/qemu/notify.h
@@ -45,12 +45,15 @@ bool notifier_list_empty(NotifierList *list);
/* Same as Notifier but allows .notify() to return errors */
typedef struct NotifierWithReturn NotifierWithReturn;
+typedef int (*NotifierWithReturnFunc)(NotifierWithReturn *notifier, void *data,
+ Error **errp);
+
struct NotifierWithReturn {
/**
* Return 0 on success (next notifier will be invoked), otherwise
* notifier_with_return_list_notify() will stop and return the value.
*/
- int (*notify)(NotifierWithReturn *notifier, void *data);
+ NotifierWithReturnFunc notify;
QLIST_ENTRY(NotifierWithReturn) node;
};
@@ -69,6 +72,6 @@ void notifier_with_return_list_add(NotifierWithReturnList *list,
void notifier_with_return_remove(NotifierWithReturn *notifier);
int notifier_with_return_list_notify(NotifierWithReturnList *list,
- void *data);
+ void *data, Error **errp);
#endif
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index f214df804b..f502345f37 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2084,7 +2084,7 @@ static int vhost_user_postcopy_end(struct vhost_dev *dev, Error **errp)
}
static int vhost_user_postcopy_notifier(NotifierWithReturn *notifier,
- void *opaque)
+ void *opaque, Error **errp)
{
struct PostcopyNotifyData *pnd = opaque;
struct vhost_user *u = container_of(notifier, struct vhost_user,
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 486fe3da32..89f853fa9e 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -633,7 +633,8 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
}
static int
-virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
+virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data,
+ Error **errp)
{
VirtIOBalloon *dev = container_of(n, VirtIOBalloon, free_page_hint_notify);
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 893ec8fa89..3ab2f6b8fd 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -80,7 +80,7 @@ int postcopy_notify(enum PostcopyNotifyReason reason, Error **errp)
pnd.errp = errp;
return notifier_with_return_list_notify(&postcopy_notifier_list,
- &pnd);
+ &pnd, errp);
}
/*
diff --git a/migration/ram.c b/migration/ram.c
index 4649a81204..5b6b09edd9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -428,7 +428,7 @@ int precopy_notify(PrecopyNotifyReason reason, Error **errp)
pnd.reason = reason;
pnd.errp = errp;
- return notifier_with_return_list_notify(&precopy_notifier_list, &pnd);
+ return notifier_with_return_list_notify(&precopy_notifier_list, &pnd, errp);
}
uint64_t ram_bytes_remaining(void)
diff --git a/util/notify.c b/util/notify.c
index 76bab212ae..c6e158ffb3 100644
--- a/util/notify.c
+++ b/util/notify.c
@@ -61,13 +61,14 @@ void notifier_with_return_remove(NotifierWithReturn *notifier)
QLIST_REMOVE(notifier, node);
}
-int notifier_with_return_list_notify(NotifierWithReturnList *list, void *data)
+int notifier_with_return_list_notify(NotifierWithReturnList *list, void *data,
+ Error **errp)
{
NotifierWithReturn *notifier, *next;
int ret = 0;
QLIST_FOREACH_SAFE(notifier, &list->notifiers, node, next) {
- ret = notifier->notify(notifier, data);
+ ret = notifier->notify(notifier, data, errp);
if (ret != 0) {
break;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 12/25] migration: remove error from notifier data
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (10 preceding siblings ...)
2024-02-28 5:13 ` [PULL 11/25] notify: pass error to notifier with return peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 13/25] migration: convert to NotifierWithReturn peterx
` (13 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Remove the error object from opaque data passed to notifiers.
Use the new error parameter passed to the notifier instead.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 1 -
migration/postcopy-ram.h | 1 -
hw/virtio/vhost-user.c | 8 ++++----
migration/postcopy-ram.c | 1 -
migration/ram.c | 1 -
5 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 1bc8902e6d..5e65c18f1a 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -31,7 +31,6 @@ typedef enum PrecopyNotifyReason {
typedef struct PrecopyNotifyData {
enum PrecopyNotifyReason reason;
- Error **errp;
} PrecopyNotifyData;
void precopy_infrastructure_init(void);
diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
index 442ab89752..ecae941211 100644
--- a/migration/postcopy-ram.h
+++ b/migration/postcopy-ram.h
@@ -128,7 +128,6 @@ enum PostcopyNotifyReason {
struct PostcopyNotifyData {
enum PostcopyNotifyReason reason;
- Error **errp;
};
void postcopy_add_notifier(NotifierWithReturn *nn);
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index f502345f37..a1eea8547e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2096,20 +2096,20 @@ static int vhost_user_postcopy_notifier(NotifierWithReturn *notifier,
if (!virtio_has_feature(dev->protocol_features,
VHOST_USER_PROTOCOL_F_PAGEFAULT)) {
/* TODO: Get the device name into this error somehow */
- error_setg(pnd->errp,
+ error_setg(errp,
"vhost-user backend not capable of postcopy");
return -ENOENT;
}
break;
case POSTCOPY_NOTIFY_INBOUND_ADVISE:
- return vhost_user_postcopy_advise(dev, pnd->errp);
+ return vhost_user_postcopy_advise(dev, errp);
case POSTCOPY_NOTIFY_INBOUND_LISTEN:
- return vhost_user_postcopy_listen(dev, pnd->errp);
+ return vhost_user_postcopy_listen(dev, errp);
case POSTCOPY_NOTIFY_INBOUND_END:
- return vhost_user_postcopy_end(dev, pnd->errp);
+ return vhost_user_postcopy_end(dev, errp);
default:
/* We ignore notifications we don't know */
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 3ab2f6b8fd..0273dc6a94 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -77,7 +77,6 @@ int postcopy_notify(enum PostcopyNotifyReason reason, Error **errp)
{
struct PostcopyNotifyData pnd;
pnd.reason = reason;
- pnd.errp = errp;
return notifier_with_return_list_notify(&postcopy_notifier_list,
&pnd, errp);
diff --git a/migration/ram.c b/migration/ram.c
index 5b6b09edd9..45a00b45ed 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -426,7 +426,6 @@ int precopy_notify(PrecopyNotifyReason reason, Error **errp)
{
PrecopyNotifyData pnd;
pnd.reason = reason;
- pnd.errp = errp;
return notifier_with_return_list_notify(&precopy_notifier_list, &pnd, errp);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 13/25] migration: convert to NotifierWithReturn
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (11 preceding siblings ...)
2024-02-28 5:13 ` [PULL 12/25] migration: remove error from notifier data peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 14/25] migration: MigrationEvent for notifiers peterx
` (12 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Change all migration notifiers to type NotifierWithReturn, so notifiers
can return an error status in a future patch. For now, pass NULL for the
notifier error parameter, and do not check the return value.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-4-git-send-email-steven.sistare@oracle.com
[peterx: dropped unexpected update to roms/seabios-hppa]
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/hw/vfio/vfio-common.h | 2 +-
include/hw/virtio/virtio-net.h | 2 +-
include/migration/misc.h | 6 +++---
include/qemu/notify.h | 1 +
hw/net/virtio-net.c | 4 +++-
hw/vfio/migration.c | 4 +++-
migration/migration.c | 16 ++++++++--------
net/vhost-vdpa.c | 6 ++++--
ui/spice-core.c | 8 +++++---
9 files changed, 29 insertions(+), 20 deletions(-)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 9b7ef7d02b..4a6c262f77 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -62,7 +62,7 @@ typedef struct VFIORegion {
typedef struct VFIOMigration {
struct VFIODevice *vbasedev;
VMChangeStateEntry *vm_state;
- Notifier migration_state;
+ NotifierWithReturn migration_state;
uint32_t device_state;
int data_fd;
void *data_buffer;
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 55977f01f0..eaee8f4243 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -221,7 +221,7 @@ struct VirtIONet {
DeviceListener primary_listener;
QDict *primary_opts;
bool primary_opts_from_json;
- Notifier migration_state;
+ NotifierWithReturn migration_state;
VirtioNetRssData rss_data;
struct NetRxPkt *rx_pkt;
struct EBPFRSSContext ebpf_rss;
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 5e65c18f1a..b62e351d96 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,9 +60,9 @@ void migration_object_init(void);
void migration_shutdown(void);
bool migration_is_idle(void);
bool migration_is_active(MigrationState *);
-void migration_add_notifier(Notifier *notify,
- void (*func)(Notifier *notifier, void *data));
-void migration_remove_notifier(Notifier *notify);
+void migration_add_notifier(NotifierWithReturn *notify,
+ NotifierWithReturnFunc func);
+void migration_remove_notifier(NotifierWithReturn *notify);
void migration_call_notifiers(MigrationState *s);
bool migration_in_setup(MigrationState *);
bool migration_has_finished(MigrationState *);
diff --git a/include/qemu/notify.h b/include/qemu/notify.h
index 9a85631864..abf18dbf59 100644
--- a/include/qemu/notify.h
+++ b/include/qemu/notify.h
@@ -45,6 +45,7 @@ bool notifier_list_empty(NotifierList *list);
/* Same as Notifier but allows .notify() to return errors */
typedef struct NotifierWithReturn NotifierWithReturn;
+/* Return int to allow for different failure modes and recovery actions */
typedef int (*NotifierWithReturnFunc)(NotifierWithReturn *notifier, void *data,
Error **errp);
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 5a79bc3a3a..75f4e8664d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3534,11 +3534,13 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
}
}
-static void virtio_net_migration_state_notifier(Notifier *notifier, void *data)
+static int virtio_net_migration_state_notifier(NotifierWithReturn *notifier,
+ void *data, Error **errp)
{
MigrationState *s = data;
VirtIONet *n = container_of(notifier, VirtIONet, migration_state);
virtio_net_handle_migration_primary(n, s);
+ return 0;
}
static bool failover_hide_primary_device(DeviceListener *listener,
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 70e6b1a709..6b6acc4140 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -754,7 +754,8 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
mig_state_to_str(new_state));
}
-static void vfio_migration_state_notifier(Notifier *notifier, void *data)
+static int vfio_migration_state_notifier(NotifierWithReturn *notifier,
+ void *data, Error **errp)
{
MigrationState *s = data;
VFIOMigration *migration = container_of(notifier, VFIOMigration,
@@ -770,6 +771,7 @@ static void vfio_migration_state_notifier(Notifier *notifier, void *data)
case MIGRATION_STATUS_FAILED:
vfio_migration_set_state_or_reset(vbasedev, VFIO_DEVICE_STATE_RUNNING);
}
+ return 0;
}
static void vfio_migration_free(VFIODevice *vbasedev)
diff --git a/migration/migration.c b/migration/migration.c
index ab21de2cad..6d4072e8e9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -69,8 +69,8 @@
#include "qemu/sockets.h"
#include "sysemu/kvm.h"
-static NotifierList migration_state_notifiers =
- NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
+static NotifierWithReturnList migration_state_notifiers =
+ NOTIFIER_WITH_RETURN_LIST_INITIALIZER(migration_state_notifiers);
/* Messages sent on the return path from destination to source */
enum mig_rp_message_type {
@@ -1459,24 +1459,24 @@ static void migrate_fd_cancel(MigrationState *s)
}
}
-void migration_add_notifier(Notifier *notify,
- void (*func)(Notifier *notifier, void *data))
+void migration_add_notifier(NotifierWithReturn *notify,
+ NotifierWithReturnFunc func)
{
notify->notify = func;
- notifier_list_add(&migration_state_notifiers, notify);
+ notifier_with_return_list_add(&migration_state_notifiers, notify);
}
-void migration_remove_notifier(Notifier *notify)
+void migration_remove_notifier(NotifierWithReturn *notify)
{
if (notify->notify) {
- notifier_remove(notify);
+ notifier_with_return_remove(notify);
notify->notify = NULL;
}
}
void migration_call_notifiers(MigrationState *s)
{
- notifier_list_notify(&migration_state_notifiers, s);
+ notifier_with_return_list_notify(&migration_state_notifiers, s, 0);
}
bool migration_in_setup(MigrationState *s)
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 3726ee5d67..1c00519f10 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -34,7 +34,7 @@
typedef struct VhostVDPAState {
NetClientState nc;
struct vhost_vdpa vhost_vdpa;
- Notifier migration_state;
+ NotifierWithReturn migration_state;
VHostNetState *vhost_net;
/* Control commands shadow buffers */
@@ -322,7 +322,8 @@ static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
}
}
-static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
+static int vdpa_net_migration_state_notifier(NotifierWithReturn *notifier,
+ void *data, Error **errp)
{
MigrationState *migration = data;
VhostVDPAState *s = container_of(notifier, VhostVDPAState,
@@ -333,6 +334,7 @@ static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
} else if (migration_has_failed(migration)) {
vhost_vdpa_net_log_global_enable(s, false);
}
+ return 0;
}
static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 37b277fd09..b3cd229023 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -42,7 +42,7 @@
/* core bits */
static SpiceServer *spice_server;
-static Notifier migration_state;
+static NotifierWithReturn migration_state;
static const char *auth = "spice";
static char *auth_passwd;
static time_t auth_expires = TIME_MAX;
@@ -568,12 +568,13 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
return info;
}
-static void migration_state_notifier(Notifier *notifier, void *data)
+static int migration_state_notifier(NotifierWithReturn *notifier,
+ void *data, Error **errp)
{
MigrationState *s = data;
if (!spice_have_target_host) {
- return;
+ return 0;
}
if (migration_in_setup(s)) {
@@ -586,6 +587,7 @@ static void migration_state_notifier(Notifier *notifier, void *data)
spice_server_migrate_end(spice_server, false);
spice_have_target_host = false;
}
+ return 0;
}
int qemu_spice_migrate_info(const char *hostname, int port, int tls_port,
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 14/25] migration: MigrationEvent for notifiers
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (12 preceding siblings ...)
2024-02-28 5:13 ` [PULL 13/25] migration: convert to NotifierWithReturn peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 15/25] migration: remove postcopy_after_devices peterx
` (11 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Passing MigrationState to notifiers is unsound because they could access
unstable migration state internals or even modify the state. Instead, pass
the minimal info needed in a new MigrationEvent struct, which could be
extended in the future if needed.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 23 ++++++++++++++++++++++-
hw/net/virtio-net.c | 11 ++++++-----
hw/vfio/migration.c | 10 +++-------
migration/migration.c | 17 ++++++++++++-----
net/vhost-vdpa.c | 6 +++---
ui/spice-core.c | 9 ++++-----
hw/vfio/trace-events | 2 +-
7 files changed, 51 insertions(+), 27 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index b62e351d96..9e4abae97f 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,10 +60,31 @@ void migration_object_init(void);
void migration_shutdown(void);
bool migration_is_idle(void);
bool migration_is_active(MigrationState *);
+
+typedef enum MigrationEventType {
+ MIG_EVENT_PRECOPY_SETUP,
+ MIG_EVENT_PRECOPY_DONE,
+ MIG_EVENT_PRECOPY_FAILED,
+ MIG_EVENT_MAX
+} MigrationEventType;
+
+typedef struct MigrationEvent {
+ MigrationEventType type;
+} MigrationEvent;
+
+/*
+ * Register the notifier @notify to be called when a migration event occurs
+ * for MIG_MODE_NORMAL, as specified by the MigrationEvent passed to @func.
+ * Notifiers may receive events in any of the following orders:
+ * - MIG_EVENT_PRECOPY_SETUP -> MIG_EVENT_PRECOPY_DONE
+ * - MIG_EVENT_PRECOPY_SETUP -> MIG_EVENT_PRECOPY_FAILED
+ * - MIG_EVENT_PRECOPY_FAILED
+ */
void migration_add_notifier(NotifierWithReturn *notify,
NotifierWithReturnFunc func);
+
void migration_remove_notifier(NotifierWithReturn *notify);
-void migration_call_notifiers(MigrationState *s);
+void migration_call_notifiers(MigrationState *s, MigrationEventType type);
bool migration_in_setup(MigrationState *);
bool migration_has_finished(MigrationState *);
bool migration_has_failed(MigrationState *);
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 75f4e8664d..e803f98c3a 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3504,7 +3504,7 @@ out:
return !err;
}
-static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
+static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationEvent *e)
{
bool should_be_hidden;
Error *err = NULL;
@@ -3516,7 +3516,7 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
should_be_hidden = qatomic_read(&n->failover_primary_hidden);
- if (migration_in_setup(s) && !should_be_hidden) {
+ if (e->type == MIG_EVENT_PRECOPY_SETUP && !should_be_hidden) {
if (failover_unplug_primary(n, dev)) {
vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
qapi_event_send_unplug_primary(dev->id);
@@ -3524,7 +3524,7 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
} else {
warn_report("couldn't unplug primary device");
}
- } else if (migration_has_failed(s)) {
+ } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
/* We already unplugged the device let's plug it back */
if (!failover_replug_primary(n, dev, &err)) {
if (err) {
@@ -3537,9 +3537,10 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
static int virtio_net_migration_state_notifier(NotifierWithReturn *notifier,
void *data, Error **errp)
{
- MigrationState *s = data;
+ MigrationEvent *e = data;
+
VirtIONet *n = container_of(notifier, VirtIONet, migration_state);
- virtio_net_handle_migration_primary(n, s);
+ virtio_net_handle_migration_primary(n, e);
return 0;
}
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 6b6acc4140..869d8417d6 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -757,18 +757,14 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
static int vfio_migration_state_notifier(NotifierWithReturn *notifier,
void *data, Error **errp)
{
- MigrationState *s = data;
+ MigrationEvent *e = data;
VFIOMigration *migration = container_of(notifier, VFIOMigration,
migration_state);
VFIODevice *vbasedev = migration->vbasedev;
- trace_vfio_migration_state_notifier(vbasedev->name,
- MigrationStatus_str(s->state));
+ trace_vfio_migration_state_notifier(vbasedev->name, e->type);
- switch (s->state) {
- case MIGRATION_STATUS_CANCELLING:
- case MIGRATION_STATUS_CANCELLED:
- case MIGRATION_STATUS_FAILED:
+ if (e->type == MIG_EVENT_PRECOPY_FAILED) {
vfio_migration_set_state_or_reset(vbasedev, VFIO_DEVICE_STATE_RUNNING);
}
return 0;
diff --git a/migration/migration.c b/migration/migration.c
index 6d4072e8e9..4650c21f67 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1319,6 +1319,8 @@ void migrate_set_state(int *state, int old_state, int new_state)
static void migrate_fd_cleanup(MigrationState *s)
{
+ MigrationEventType type;
+
g_free(s->hostname);
s->hostname = NULL;
json_writer_free(s->vmdesc);
@@ -1367,7 +1369,9 @@ static void migrate_fd_cleanup(MigrationState *s)
/* It is used on info migrate. We can't free it */
error_report_err(error_copy(s->error));
}
- migration_call_notifiers(s);
+ type = migration_has_failed(s) ? MIG_EVENT_PRECOPY_FAILED :
+ MIG_EVENT_PRECOPY_DONE;
+ migration_call_notifiers(s, type);
block_cleanup_parameters();
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
}
@@ -1474,9 +1478,12 @@ void migration_remove_notifier(NotifierWithReturn *notify)
}
}
-void migration_call_notifiers(MigrationState *s)
+void migration_call_notifiers(MigrationState *s, MigrationEventType type)
{
- notifier_with_return_list_notify(&migration_state_notifiers, s, 0);
+ MigrationEvent e;
+
+ e.type = type;
+ notifier_with_return_list_notify(&migration_state_notifiers, &e, 0);
}
bool migration_in_setup(MigrationState *s)
@@ -2537,7 +2544,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
* spice needs to trigger a transition now
*/
ms->postcopy_after_devices = true;
- migration_call_notifiers(ms);
+ migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE);
migration_downtime_end(ms);
@@ -3601,7 +3608,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
rate_limit = migrate_max_bandwidth();
/* Notify before starting migration thread */
- migration_call_notifiers(s);
+ migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP);
}
migration_rate_set(rate_limit);
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 1c00519f10..a29d18a9ef 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -325,13 +325,13 @@ static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
static int vdpa_net_migration_state_notifier(NotifierWithReturn *notifier,
void *data, Error **errp)
{
- MigrationState *migration = data;
+ MigrationEvent *e = data;
VhostVDPAState *s = container_of(notifier, VhostVDPAState,
migration_state);
- if (migration_in_setup(migration)) {
+ if (e->type == MIG_EVENT_PRECOPY_SETUP) {
vhost_vdpa_net_log_global_enable(s, true);
- } else if (migration_has_failed(migration)) {
+ } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
vhost_vdpa_net_log_global_enable(s, false);
}
return 0;
diff --git a/ui/spice-core.c b/ui/spice-core.c
index b3cd229023..0a59876da2 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -571,19 +571,18 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
static int migration_state_notifier(NotifierWithReturn *notifier,
void *data, Error **errp)
{
- MigrationState *s = data;
+ MigrationEvent *e = data;
if (!spice_have_target_host) {
return 0;
}
- if (migration_in_setup(s)) {
+ if (e->type == MIG_EVENT_PRECOPY_SETUP) {
spice_server_migrate_start(spice_server);
- } else if (migration_has_finished(s) ||
- migration_in_postcopy_after_devices(s)) {
+ } else if (e->type == MIG_EVENT_PRECOPY_DONE) {
spice_server_migrate_end(spice_server, true);
spice_have_target_host = false;
- } else if (migration_has_failed(s)) {
+ } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
spice_server_migrate_end(spice_server, false);
spice_have_target_host = false;
}
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 8fdde54456..f0474b244b 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -153,7 +153,7 @@ vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size 0x%"PRIx64" ret %d"
vfio_migration_realize(const char *name) " (%s)"
vfio_migration_set_state(const char *name, const char *state) " (%s) state %s"
-vfio_migration_state_notifier(const char *name, const char *state) " (%s) state %s"
+vfio_migration_state_notifier(const char *name, int state) " (%s) state %d"
vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
vfio_save_cleanup(const char *name) " (%s)"
vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 15/25] migration: remove postcopy_after_devices
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (13 preceding siblings ...)
2024-02-28 5:13 ` [PULL 14/25] migration: MigrationEvent for notifiers peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 16/25] migration: MigrationNotifyFunc peterx
` (10 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
postcopy_after_devices and migration_in_postcopy_after_devices are no
longer used, so delete them.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-6-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 1 -
migration/migration.h | 2 --
migration/migration.c | 7 -------
3 files changed, 10 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 9e4abae97f..e6150009e0 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -89,7 +89,6 @@ bool migration_in_setup(MigrationState *);
bool migration_has_finished(MigrationState *);
bool migration_has_failed(MigrationState *);
/* ...and after the device transmission */
-bool migration_in_postcopy_after_devices(MigrationState *);
/* True if incoming migration entered POSTCOPY_INCOMING_DISCARD */
bool migration_in_incoming_postcopy(void);
/* True if incoming migration entered POSTCOPY_INCOMING_ADVISE */
diff --git a/migration/migration.h b/migration/migration.h
index f2c8b8f286..aef8afbe1f 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -348,8 +348,6 @@ struct MigrationState {
/* Flag set once the migration has been asked to enter postcopy */
bool start_postcopy;
- /* Flag set after postcopy has sent the device state */
- bool postcopy_after_devices;
/* Flag set once the migration thread is running (and needs joining) */
bool migration_thread_running;
diff --git a/migration/migration.c b/migration/migration.c
index 4650c21f67..8f7f2d92f4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1527,11 +1527,6 @@ bool migration_postcopy_is_alive(int state)
}
}
-bool migration_in_postcopy_after_devices(MigrationState *s)
-{
- return migration_in_postcopy() && s->postcopy_after_devices;
-}
-
bool migration_in_incoming_postcopy(void)
{
PostcopyState ps = postcopy_state_get();
@@ -1613,7 +1608,6 @@ int migrate_init(MigrationState *s, Error **errp)
s->expected_downtime = 0;
s->setup_time = 0;
s->start_postcopy = false;
- s->postcopy_after_devices = false;
s->migration_thread_running = false;
error_free(s->error);
s->error = NULL;
@@ -2543,7 +2537,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
* at the transition to postcopy and after the device state; in particular
* spice needs to trigger a transition now
*/
- ms->postcopy_after_devices = true;
migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE);
migration_downtime_end(ms);
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 16/25] migration: MigrationNotifyFunc
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (14 preceding siblings ...)
2024-02-28 5:13 ` [PULL 15/25] migration: remove postcopy_after_devices peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 17/25] migration: per-mode notifiers peterx
` (9 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Define MigrationNotifyFunc to improve type safety and simplify migration
notifiers.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 5 ++++-
hw/net/virtio-net.c | 4 +---
hw/vfio/migration.c | 3 +--
migration/migration.c | 4 ++--
net/vhost-vdpa.c | 6 ++----
ui/spice-core.c | 4 +---
6 files changed, 11 insertions(+), 15 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index e6150009e0..e36a1f3ec4 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -72,6 +72,9 @@ typedef struct MigrationEvent {
MigrationEventType type;
} MigrationEvent;
+typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
+ MigrationEvent *e, Error **errp);
+
/*
* Register the notifier @notify to be called when a migration event occurs
* for MIG_MODE_NORMAL, as specified by the MigrationEvent passed to @func.
@@ -81,7 +84,7 @@ typedef struct MigrationEvent {
* - MIG_EVENT_PRECOPY_FAILED
*/
void migration_add_notifier(NotifierWithReturn *notify,
- NotifierWithReturnFunc func);
+ MigrationNotifyFunc func);
void migration_remove_notifier(NotifierWithReturn *notify);
void migration_call_notifiers(MigrationState *s, MigrationEventType type);
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index e803f98c3a..a3c711b56d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3535,10 +3535,8 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationEvent *e)
}
static int virtio_net_migration_state_notifier(NotifierWithReturn *notifier,
- void *data, Error **errp)
+ MigrationEvent *e, Error **errp)
{
- MigrationEvent *e = data;
-
VirtIONet *n = container_of(notifier, VirtIONet, migration_state);
virtio_net_handle_migration_primary(n, e);
return 0;
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 869d8417d6..50140eda87 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -755,9 +755,8 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
}
static int vfio_migration_state_notifier(NotifierWithReturn *notifier,
- void *data, Error **errp)
+ MigrationEvent *e, Error **errp)
{
- MigrationEvent *e = data;
VFIOMigration *migration = container_of(notifier, VFIOMigration,
migration_state);
VFIODevice *vbasedev = migration->vbasedev;
diff --git a/migration/migration.c b/migration/migration.c
index 8f7f2d92f4..33149c462c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1464,9 +1464,9 @@ static void migrate_fd_cancel(MigrationState *s)
}
void migration_add_notifier(NotifierWithReturn *notify,
- NotifierWithReturnFunc func)
+ MigrationNotifyFunc func)
{
- notify->notify = func;
+ notify->notify = (NotifierWithReturnFunc)func;
notifier_with_return_list_add(&migration_state_notifiers, notify);
}
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a29d18a9ef..e6bdb4562d 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -323,11 +323,9 @@ static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
}
static int vdpa_net_migration_state_notifier(NotifierWithReturn *notifier,
- void *data, Error **errp)
+ MigrationEvent *e, Error **errp)
{
- MigrationEvent *e = data;
- VhostVDPAState *s = container_of(notifier, VhostVDPAState,
- migration_state);
+ VhostVDPAState *s = container_of(notifier, VhostVDPAState, migration_state);
if (e->type == MIG_EVENT_PRECOPY_SETUP) {
vhost_vdpa_net_log_global_enable(s, true);
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 0a59876da2..15be640286 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -569,10 +569,8 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
}
static int migration_state_notifier(NotifierWithReturn *notifier,
- void *data, Error **errp)
+ MigrationEvent *e, Error **errp)
{
- MigrationEvent *e = data;
-
if (!spice_have_target_host) {
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 17/25] migration: per-mode notifiers
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (15 preceding siblings ...)
2024-02-28 5:13 ` [PULL 16/25] migration: MigrationNotifyFunc peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 18/25] migration: refactor migrate_fd_connect failures peterx
` (8 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Keep a separate list of migration notifiers for each migration mode.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 6 ++++++
migration/migration.c | 22 +++++++++++++++++-----
2 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index e36a1f3ec4..4dc06a92b7 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -86,6 +86,12 @@ typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
void migration_add_notifier(NotifierWithReturn *notify,
MigrationNotifyFunc func);
+/*
+ * Same as migration_add_notifier, but applies to be specified @mode.
+ */
+void migration_add_notifier_mode(NotifierWithReturn *notify,
+ MigrationNotifyFunc func, MigMode mode);
+
void migration_remove_notifier(NotifierWithReturn *notify);
void migration_call_notifiers(MigrationState *s, MigrationEventType type);
bool migration_in_setup(MigrationState *);
diff --git a/migration/migration.c b/migration/migration.c
index 33149c462c..925103b61a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -69,8 +69,13 @@
#include "qemu/sockets.h"
#include "sysemu/kvm.h"
-static NotifierWithReturnList migration_state_notifiers =
- NOTIFIER_WITH_RETURN_LIST_INITIALIZER(migration_state_notifiers);
+#define NOTIFIER_ELEM_INIT(array, elem) \
+ [elem] = NOTIFIER_WITH_RETURN_LIST_INITIALIZER((array)[elem])
+
+static NotifierWithReturnList migration_state_notifiers[] = {
+ NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_NORMAL),
+ NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_REBOOT),
+};
/* Messages sent on the return path from destination to source */
enum mig_rp_message_type {
@@ -1463,11 +1468,17 @@ static void migrate_fd_cancel(MigrationState *s)
}
}
+void migration_add_notifier_mode(NotifierWithReturn *notify,
+ MigrationNotifyFunc func, MigMode mode)
+{
+ notify->notify = (NotifierWithReturnFunc)func;
+ notifier_with_return_list_add(&migration_state_notifiers[mode], notify);
+}
+
void migration_add_notifier(NotifierWithReturn *notify,
MigrationNotifyFunc func)
{
- notify->notify = (NotifierWithReturnFunc)func;
- notifier_with_return_list_add(&migration_state_notifiers, notify);
+ migration_add_notifier_mode(notify, func, MIG_MODE_NORMAL);
}
void migration_remove_notifier(NotifierWithReturn *notify)
@@ -1480,10 +1491,11 @@ void migration_remove_notifier(NotifierWithReturn *notify)
void migration_call_notifiers(MigrationState *s, MigrationEventType type)
{
+ MigMode mode = s->parameters.mode;
MigrationEvent e;
e.type = type;
- notifier_with_return_list_notify(&migration_state_notifiers, &e, 0);
+ notifier_with_return_list_notify(&migration_state_notifiers[mode], &e, 0);
}
bool migration_in_setup(MigrationState *s)
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 18/25] migration: refactor migrate_fd_connect failures
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (16 preceding siblings ...)
2024-02-28 5:13 ` [PULL 17/25] migration: per-mode notifiers peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 19/25] migration: notifier error checking peterx
` (7 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Move common code for the error path in migrate_fd_connect to a shared
fail label. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 925103b61a..6a115d28b8 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3627,11 +3627,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
if (migrate_postcopy_ram() || migrate_return_path()) {
if (open_return_path_on_source(s)) {
error_setg(&local_err, "Unable to open return-path for postcopy");
- migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
- migrate_set_error(s, local_err);
- error_report_err(local_err);
- migrate_fd_cleanup(s);
- return;
+ goto fail;
}
}
@@ -3660,6 +3656,13 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
migration_thread, s, QEMU_THREAD_JOINABLE);
}
s->migration_thread_running = true;
+ return;
+
+fail:
+ migrate_set_error(s, local_err);
+ migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+ error_report_err(local_err);
+ migrate_fd_cleanup(s);
}
static void migration_class_init(ObjectClass *klass, void *data)
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 19/25] migration: notifier error checking
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (17 preceding siblings ...)
2024-02-28 5:13 ` [PULL 18/25] migration: refactor migrate_fd_connect failures peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 20/25] migration: stop vm for cpr peterx
` (6 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Check the status returned by migration notifiers for event type
MIG_EVENT_PRECOPY_SETUP, and report errors. None of the notifiers
return an error status at this time.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-10-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 8 +++++++-
migration/migration.c | 25 ++++++++++++++++---------
2 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 4dc06a92b7..e4933b815b 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -72,6 +72,11 @@ typedef struct MigrationEvent {
MigrationEventType type;
} MigrationEvent;
+/*
+ * A MigrationNotifyFunc may return an error code and an Error object,
+ * but only when @e->type is MIG_EVENT_PRECOPY_SETUP. The code is an int
+ * to allow for different failure modes and recovery actions.
+ */
typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
MigrationEvent *e, Error **errp);
@@ -93,7 +98,8 @@ void migration_add_notifier_mode(NotifierWithReturn *notify,
MigrationNotifyFunc func, MigMode mode);
void migration_remove_notifier(NotifierWithReturn *notify);
-void migration_call_notifiers(MigrationState *s, MigrationEventType type);
+int migration_call_notifiers(MigrationState *s, MigrationEventType type,
+ Error **errp);
bool migration_in_setup(MigrationState *);
bool migration_has_finished(MigrationState *);
bool migration_has_failed(MigrationState *);
diff --git a/migration/migration.c b/migration/migration.c
index 6a115d28b8..37c836b0b0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1376,7 +1376,7 @@ static void migrate_fd_cleanup(MigrationState *s)
}
type = migration_has_failed(s) ? MIG_EVENT_PRECOPY_FAILED :
MIG_EVENT_PRECOPY_DONE;
- migration_call_notifiers(s, type);
+ migration_call_notifiers(s, type, NULL);
block_cleanup_parameters();
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
}
@@ -1489,13 +1489,18 @@ void migration_remove_notifier(NotifierWithReturn *notify)
}
}
-void migration_call_notifiers(MigrationState *s, MigrationEventType type)
+int migration_call_notifiers(MigrationState *s, MigrationEventType type,
+ Error **errp)
{
MigMode mode = s->parameters.mode;
MigrationEvent e;
+ int ret;
e.type = type;
- notifier_with_return_list_notify(&migration_state_notifiers[mode], &e, 0);
+ ret = notifier_with_return_list_notify(&migration_state_notifiers[mode],
+ &e, errp);
+ assert(!ret || type == MIG_EVENT_PRECOPY_SETUP);
+ return ret;
}
bool migration_in_setup(MigrationState *s)
@@ -2549,7 +2554,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
* at the transition to postcopy and after the device state; in particular
* spice needs to trigger a transition now
*/
- migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE);
+ migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE, NULL);
migration_downtime_end(ms);
@@ -2569,11 +2574,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
ret = qemu_file_get_error(ms->to_dst_file);
if (ret) {
- error_setg(errp, "postcopy_start: Migration stream errored");
- migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_FAILED);
+ error_setg_errno(errp, -ret, "postcopy_start: Migration stream error");
+ bql_lock();
+ goto fail;
}
-
trace_postcopy_preempt_enabled(migrate_postcopy_preempt());
return ret;
@@ -2594,6 +2598,7 @@ fail:
error_report_err(local_err);
}
}
+ migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
bql_unlock();
return -1;
}
@@ -3613,7 +3618,9 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
rate_limit = migrate_max_bandwidth();
/* Notify before starting migration thread */
- migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP);
+ if (migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP, &local_err)) {
+ goto fail;
+ }
}
migration_rate_set(rate_limit);
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 20/25] migration: stop vm for cpr
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (18 preceding siblings ...)
2024-02-28 5:13 ` [PULL 19/25] migration: notifier error checking peterx
@ 2024-02-28 5:13 ` peterx
2024-05-13 18:22 ` CPR/liveupdate: test results using prior bug fix Michael Galaxy
2024-02-28 5:13 ` [PULL 21/25] migration: update cpr-reboot description peterx
` (5 subsequent siblings)
25 siblings, 1 reply; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
When migration for cpr is initiated, stop the vm and set state
RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
possibility of ram and device state being out of sync, and guarantees
that a guest in the suspended state remains suspended, because qmp_cont
rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-11-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 1 +
migration/migration.h | 2 --
migration/migration.c | 51 ++++++++++++++++++++++++----------------
3 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index e4933b815b..5d1aa593ed 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,6 +60,7 @@ void migration_object_init(void);
void migration_shutdown(void);
bool migration_is_idle(void);
bool migration_is_active(MigrationState *);
+bool migrate_mode_is_cpr(MigrationState *);
typedef enum MigrationEventType {
MIG_EVENT_PRECOPY_SETUP,
diff --git a/migration/migration.h b/migration/migration.h
index aef8afbe1f..65c0b61cbd 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
*/
void migration_rp_kick(MigrationState *s);
-int migration_stop_vm(RunState state);
-
#endif
diff --git a/migration/migration.c b/migration/migration.c
index 37c836b0b0..90a90947fb 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -167,11 +167,19 @@ static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp)
return (a > b) - (a < b);
}
-int migration_stop_vm(RunState state)
+static int migration_stop_vm(MigrationState *s, RunState state)
{
- int ret = vm_stop_force_state(state);
+ int ret;
+
+ migration_downtime_start(s);
+
+ s->vm_old_state = runstate_get();
+ global_state_store();
+
+ ret = vm_stop_force_state(state);
trace_vmstate_downtime_checkpoint("src-vm-stopped");
+ trace_migration_completion_vm_stop(ret);
return ret;
}
@@ -1602,6 +1610,11 @@ bool migration_is_active(MigrationState *s)
s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
}
+bool migrate_mode_is_cpr(MigrationState *s)
+{
+ return s->parameters.mode == MIG_MODE_CPR_REBOOT;
+}
+
int migrate_init(MigrationState *s, Error **errp)
{
int ret;
@@ -2454,10 +2467,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
bql_lock();
trace_postcopy_start_set_run();
- migration_downtime_start(ms);
-
- global_state_store();
- ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
+ ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
if (ret < 0) {
goto fail;
}
@@ -2652,15 +2662,12 @@ static int migration_completion_precopy(MigrationState *s,
int ret;
bql_lock();
- migration_downtime_start(s);
-
- s->vm_old_state = runstate_get();
- global_state_store();
- ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
- trace_migration_completion_vm_stop(ret);
- if (ret < 0) {
- goto out_unlock;
+ if (!migrate_mode_is_cpr(s)) {
+ ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
+ if (ret < 0) {
+ goto out_unlock;
+ }
}
ret = migration_maybe_pause(s, current_active_state,
@@ -3500,15 +3507,10 @@ static void *bg_migration_thread(void *opaque)
s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
trace_migration_thread_setup_complete();
- migration_downtime_start(s);
bql_lock();
- s->vm_old_state = runstate_get();
-
- global_state_store();
- /* Forcibly stop VM before saving state of vCPUs and devices */
- if (migration_stop_vm(RUN_STATE_PAUSED)) {
+ if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
goto fail;
}
/*
@@ -3584,6 +3586,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
Error *local_err = NULL;
uint64_t rate_limit;
bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
+ int ret;
/*
* If there's a previous error, free it and prepare for another one.
@@ -3655,6 +3658,14 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
return;
}
+ if (migrate_mode_is_cpr(s)) {
+ ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
+ if (ret < 0) {
+ error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
+ goto fail;
+ }
+ }
+
if (migrate_background_snapshot()) {
qemu_thread_create(&s->thread, "bg_snapshot",
bg_migration_thread, s, QEMU_THREAD_JOINABLE);
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* CPR/liveupdate: test results using prior bug fix
2024-02-28 5:13 ` [PULL 20/25] migration: stop vm for cpr peterx
@ 2024-05-13 18:22 ` Michael Galaxy
2024-05-13 20:10 ` Steven Sistare
0 siblings, 1 reply; 37+ messages in thread
From: Michael Galaxy @ 2024-05-13 18:22 UTC (permalink / raw)
To: peterx, Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Steve Sistare, Hunt, Joshua
[-- Attachment #1: Type: text/plain, Size: 7219 bytes --]
Hi Steve,
We found that this specific change in particular ("migration: stop vm
for cpr") fixes a bug that we've identified in testing back-to-back live
updates in a lab environment.
More specifically, *without* this change (which is not available in
8.2.2, but *is* available in 9.0.0) causes the metadata save file to be
corrupted when doing live updates one after another. Typically we see a
corrupted save file somewhere in between 20 and 30 live updates and
while doing a git bisect, we found that this change makes the problem go
away.
Were you aware? Is there any plan in place to cherry pick this for
8.2.3, perhaps or a plan to release 8.2.3 at some point?
Here are some examples of how the bug manifests in different locations
of the QEMU metadata save file:
2024-04-26T13:28:53Z qemu-system-x86_64: Failed to load mtrr_var:base
2024-04-26T13:28:53Z qemu-system-x86_64: Failed to load cpu:env.mtrr_var
2024-04-26T13:28:53Z qemu-system-x86_64: error while loading state for instance 0x1b of device 'cpu'
2024-04-26T13:28:53Z qemu-system-x86_64: load of migration failed: Input/output error
And another:
2024-04-17T16:09:47Z qemu-system-x86_64: check_section_footer: Read section footer failed: -5
2024-04-17T16:09:47Z qemu-system-x86_64: load of migration failed: Invalid argument
And another:
2024-04-30T21:53:29Z qemu-system-x86_64: Unable to read ID string for section 163
2024-04-30T21:53:29Z qemu-system-x86_64: load of migration failed: Invalid argument
And another:
2024-05-01T16:01:44Z qemu-system-x86_64: Unable to read ID string for section 164
2024-05-01T16:01:44Z qemu-system-x86_64: load of migration failed: Invalid argument
As you can see, they occur quite randomly, but generally it takes at
least 20-30+ live updates before the problem occurs.
- Michael
On 2/27/24 23:13, peterx@redhat.com wrote:
> From: Steve Sistare<steven.sistare@oracle.com>
>
> When migration for cpr is initiated, stop the vm and set state
> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
> possibility of ram and device state being out of sync, and guarantees
> that a guest in the suspended state remains suspended, because qmp_cont
> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>
> Signed-off-by: Steve Sistare<steven.sistare@oracle.com>
> Reviewed-by: Peter Xu<peterx@redhat.com>
> Link:https://urldefense.com/v3/__https://lore.kernel.org/r/1708622920-68779-11-git-send-email-steven.sistare@oracle.com__;!!GjvTz_vk!QLsFOCX-x2U9bzAo98SdidKlomHrmf_t0UmQKtgudoIcaDVoAJOPm39ZqaNP_nT5I8QqVfSgwhDZmg$
> Signed-off-by: Peter Xu<peterx@redhat.com>
> ---
> include/migration/misc.h | 1 +
> migration/migration.h | 2 --
> migration/migration.c | 51 ++++++++++++++++++++++++----------------
> 3 files changed, 32 insertions(+), 22 deletions(-)
>
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index e4933b815b..5d1aa593ed 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -60,6 +60,7 @@ void migration_object_init(void);
> void migration_shutdown(void);
> bool migration_is_idle(void);
> bool migration_is_active(MigrationState *);
> +bool migrate_mode_is_cpr(MigrationState *);
>
> typedef enum MigrationEventType {
> MIG_EVENT_PRECOPY_SETUP,
> diff --git a/migration/migration.h b/migration/migration.h
> index aef8afbe1f..65c0b61cbd 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
> */
> void migration_rp_kick(MigrationState *s);
>
> -int migration_stop_vm(RunState state);
> -
> #endif
> diff --git a/migration/migration.c b/migration/migration.c
> index 37c836b0b0..90a90947fb 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -167,11 +167,19 @@ static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp)
> return (a > b) - (a < b);
> }
>
> -int migration_stop_vm(RunState state)
> +static int migration_stop_vm(MigrationState *s, RunState state)
> {
> - int ret = vm_stop_force_state(state);
> + int ret;
> +
> + migration_downtime_start(s);
> +
> + s->vm_old_state = runstate_get();
> + global_state_store();
> +
> + ret = vm_stop_force_state(state);
>
> trace_vmstate_downtime_checkpoint("src-vm-stopped");
> + trace_migration_completion_vm_stop(ret);
>
> return ret;
> }
> @@ -1602,6 +1610,11 @@ bool migration_is_active(MigrationState *s)
> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
> }
>
> +bool migrate_mode_is_cpr(MigrationState *s)
> +{
> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
> +}
> +
> int migrate_init(MigrationState *s, Error **errp)
> {
> int ret;
> @@ -2454,10 +2467,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
> bql_lock();
> trace_postcopy_start_set_run();
>
> - migration_downtime_start(ms);
> -
> - global_state_store();
> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> + ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
> if (ret < 0) {
> goto fail;
> }
> @@ -2652,15 +2662,12 @@ static int migration_completion_precopy(MigrationState *s,
> int ret;
>
> bql_lock();
> - migration_downtime_start(s);
> -
> - s->vm_old_state = runstate_get();
> - global_state_store();
>
> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> - trace_migration_completion_vm_stop(ret);
> - if (ret < 0) {
> - goto out_unlock;
> + if (!migrate_mode_is_cpr(s)) {
> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
> + if (ret < 0) {
> + goto out_unlock;
> + }
> }
>
> ret = migration_maybe_pause(s, current_active_state,
> @@ -3500,15 +3507,10 @@ static void *bg_migration_thread(void *opaque)
> s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
>
> trace_migration_thread_setup_complete();
> - migration_downtime_start(s);
>
> bql_lock();
>
> - s->vm_old_state = runstate_get();
> -
> - global_state_store();
> - /* Forcibly stop VM before saving state of vCPUs and devices */
> - if (migration_stop_vm(RUN_STATE_PAUSED)) {
> + if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
> goto fail;
> }
> /*
> @@ -3584,6 +3586,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> Error *local_err = NULL;
> uint64_t rate_limit;
> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
> + int ret;
>
> /*
> * If there's a previous error, free it and prepare for another one.
> @@ -3655,6 +3658,14 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> return;
> }
>
> + if (migrate_mode_is_cpr(s)) {
> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
> + if (ret < 0) {
> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
> + goto fail;
> + }
> + }
> +
> if (migrate_background_snapshot()) {
> qemu_thread_create(&s->thread, "bg_snapshot",
> bg_migration_thread, s, QEMU_THREAD_JOINABLE);
[-- Attachment #2: Type: text/html, Size: 8794 bytes --]
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-13 18:22 ` CPR/liveupdate: test results using prior bug fix Michael Galaxy
@ 2024-05-13 20:10 ` Steven Sistare
2024-05-14 1:15 ` Michael Galaxy
0 siblings, 1 reply; 37+ messages in thread
From: Steven Sistare @ 2024-05-13 20:10 UTC (permalink / raw)
To: Michael Galaxy, peterx, Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
Hi Michael,
No surprise here, I did see some of the same failure messages and they
prompted me to submit the fix. They are all symptoms of "the possibility of
ram and device state being out of sync" as mentioned in the commit.
I am not familiar with the process for maintaining old releases for qemu.
Perhaps someone on this list can comment on 8.2.3.
- Steve
On 5/13/2024 2:22 PM, Michael Galaxy wrote:
> Hi Steve,
>
> We found that this specific change in particular ("migration: stop vm for cpr")
> fixes a bug that we've identified in testing back-to-back live updates in a lab
> environment.
>
> More specifically, *without* this change (which is not available in 8.2.2, but
> *is* available in 9.0.0) causes the metadata save file to be corrupted when
> doing live updates one after another. Typically we see a corrupted save file
> somewhere in between 20 and 30 live updates and while doing a git bisect, we
> found that this change makes the problem go away.
>
> Were you aware? Is there any plan in place to cherry pick this for 8.2.3,
> perhaps or a plan to release 8.2.3 at some point?
>
> Here are some examples of how the bug manifests in different locations of the
> QEMU metadata save file:
>
> 2024-04-26T13:28:53Z qemu-system-x86_64: Failed to load mtrr_var:base
> 2024-04-26T13:28:53Z qemu-system-x86_64: Failed to load cpu:env.mtrr_var
> 2024-04-26T13:28:53Z qemu-system-x86_64: error while loading state for instance 0x1b of device 'cpu'
> 2024-04-26T13:28:53Z qemu-system-x86_64: load of migration failed: Input/output error
>
> And another:
>
> 2024-04-17T16:09:47Z qemu-system-x86_64: check_section_footer: Read section footer failed: -5
> 2024-04-17T16:09:47Z qemu-system-x86_64: load of migration failed: Invalid argument
>
> And another:
>
> 2024-04-30T21:53:29Z qemu-system-x86_64: Unable to read ID string for section 163
> 2024-04-30T21:53:29Z qemu-system-x86_64: load of migration failed: Invalid argument
>
> And another:
>
> 2024-05-01T16:01:44Z qemu-system-x86_64: Unable to read ID string for section 164
> 2024-05-01T16:01:44Z qemu-system-x86_64: load of migration failed: Invalid argument
>
>
> As you can see, they occur quite randomly, but generally it takes at least
> 20-30+ live updates before the problem occurs.
>
> - Michael
>
> On 2/27/24 23:13, peterx@redhat.com wrote:
>> From: Steve Sistare<steven.sistare@oracle.com>
>>
>> When migration for cpr is initiated, stop the vm and set state
>> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
>> possibility of ram and device state being out of sync, and guarantees
>> that a guest in the suspended state remains suspended, because qmp_cont
>> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>>
>> Signed-off-by: Steve Sistare<steven.sistare@oracle.com>
>> Reviewed-by: Peter Xu<peterx@redhat.com>
>> Link:https://urldefense.com/v3/__https://lore.kernel.org/r/1708622920-68779-11-git-send-email-steven.sistare@oracle.com__;!!GjvTz_vk!QLsFOCX-x2U9bzAo98SdidKlomHrmf_t0UmQKtgudoIcaDVoAJOPm39ZqaNP_nT5I8QqVfSgwhDZmg$
>> Signed-off-by: Peter Xu<peterx@redhat.com>
>> ---
>> include/migration/misc.h | 1 +
>> migration/migration.h | 2 --
>> migration/migration.c | 51 ++++++++++++++++++++++++----------------
>> 3 files changed, 32 insertions(+), 22 deletions(-)
>>
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index e4933b815b..5d1aa593ed 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -60,6 +60,7 @@ void migration_object_init(void);
>> void migration_shutdown(void);
>> bool migration_is_idle(void);
>> bool migration_is_active(MigrationState *);
>> +bool migrate_mode_is_cpr(MigrationState *);
>>
>> typedef enum MigrationEventType {
>> MIG_EVENT_PRECOPY_SETUP,
>> diff --git a/migration/migration.h b/migration/migration.h
>> index aef8afbe1f..65c0b61cbd 100644
>> --- a/migration/migration.h
>> +++ b/migration/migration.h
>> @@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
>> */
>> void migration_rp_kick(MigrationState *s);
>>
>> -int migration_stop_vm(RunState state);
>> -
>> #endif
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 37c836b0b0..90a90947fb 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -167,11 +167,19 @@ static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp)
>> return (a > b) - (a < b);
>> }
>>
>> -int migration_stop_vm(RunState state)
>> +static int migration_stop_vm(MigrationState *s, RunState state)
>> {
>> - int ret = vm_stop_force_state(state);
>> + int ret;
>> +
>> + migration_downtime_start(s);
>> +
>> + s->vm_old_state = runstate_get();
>> + global_state_store();
>> +
>> + ret = vm_stop_force_state(state);
>>
>> trace_vmstate_downtime_checkpoint("src-vm-stopped");
>> + trace_migration_completion_vm_stop(ret);
>>
>> return ret;
>> }
>> @@ -1602,6 +1610,11 @@ bool migration_is_active(MigrationState *s)
>> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
>> }
>>
>> +bool migrate_mode_is_cpr(MigrationState *s)
>> +{
>> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
>> +}
>> +
>> int migrate_init(MigrationState *s, Error **errp)
>> {
>> int ret;
>> @@ -2454,10 +2467,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>> bql_lock();
>> trace_postcopy_start_set_run();
>>
>> - migration_downtime_start(ms);
>> -
>> - global_state_store();
>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> + ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
>> if (ret < 0) {
>> goto fail;
>> }
>> @@ -2652,15 +2662,12 @@ static int migration_completion_precopy(MigrationState *s,
>> int ret;
>>
>> bql_lock();
>> - migration_downtime_start(s);
>> -
>> - s->vm_old_state = runstate_get();
>> - global_state_store();
>>
>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> - trace_migration_completion_vm_stop(ret);
>> - if (ret < 0) {
>> - goto out_unlock;
>> + if (!migrate_mode_is_cpr(s)) {
>> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
>> + if (ret < 0) {
>> + goto out_unlock;
>> + }
>> }
>>
>> ret = migration_maybe_pause(s, current_active_state,
>> @@ -3500,15 +3507,10 @@ static void *bg_migration_thread(void *opaque)
>> s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
>>
>> trace_migration_thread_setup_complete();
>> - migration_downtime_start(s);
>>
>> bql_lock();
>>
>> - s->vm_old_state = runstate_get();
>> -
>> - global_state_store();
>> - /* Forcibly stop VM before saving state of vCPUs and devices */
>> - if (migration_stop_vm(RUN_STATE_PAUSED)) {
>> + if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
>> goto fail;
>> }
>> /*
>> @@ -3584,6 +3586,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> Error *local_err = NULL;
>> uint64_t rate_limit;
>> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
>> + int ret;
>>
>> /*
>> * If there's a previous error, free it and prepare for another one.
>> @@ -3655,6 +3658,14 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> return;
>> }
>>
>> + if (migrate_mode_is_cpr(s)) {
>> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
>> + if (ret < 0) {
>> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
>> + goto fail;
>> + }
>> + }
>> +
>> if (migrate_background_snapshot()) {
>> qemu_thread_create(&s->thread, "bg_snapshot",
>> bg_migration_thread, s, QEMU_THREAD_JOINABLE);
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-13 20:10 ` Steven Sistare
@ 2024-05-14 1:15 ` Michael Galaxy
2024-05-14 13:39 ` Michael Galaxy
0 siblings, 1 reply; 37+ messages in thread
From: Michael Galaxy @ 2024-05-14 1:15 UTC (permalink / raw)
To: Steven Sistare, peterx, Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
Hi Steve,
Thanks for the response.
It looks like literally *just today* 8.2.4 was released. I'll go check
it out.
- Michael
On 5/13/24 15:10, Steven Sistare wrote:
> Hi Michael,
> No surprise here, I did see some of the same failure messages and they
> prompted me to submit the fix. They are all symptoms of "the
> possibility of
> ram and device state being out of sync" as mentioned in the commit.
>
> I am not familiar with the process for maintaining old releases for qemu.
> Perhaps someone on this list can comment on 8.2.3.
>
> - Steve
>
> On 5/13/2024 2:22 PM, Michael Galaxy wrote:
>> Hi Steve,
>>
>> We found that this specific change in particular ("migration: stop vm
>> for cpr") fixes a bug that we've identified in testing back-to-back
>> live updates in a lab environment.
>>
>> More specifically, *without* this change (which is not available in
>> 8.2.2, but *is* available in 9.0.0) causes the metadata save file to
>> be corrupted when doing live updates one after another. Typically we
>> see a corrupted save file somewhere in between 20 and 30 live updates
>> and while doing a git bisect, we found that this change makes the
>> problem go away.
>>
>> Were you aware? Is there any plan in place to cherry pick this for
>> 8.2.3, perhaps or a plan to release 8.2.3 at some point?
>>
>> Here are some examples of how the bug manifests in different
>> locations of the QEMU metadata save file:
>>
>> 2024-04-26T13:28:53Z qemu-system-x86_64: Failed to load mtrr_var:base
>> 2024-04-26T13:28:53Z qemu-system-x86_64: Failed to load cpu:env.mtrr_var
>> 2024-04-26T13:28:53Z qemu-system-x86_64: error while loading state
>> for instance 0x1b of device 'cpu'
>> 2024-04-26T13:28:53Z qemu-system-x86_64: load of migration failed:
>> Input/output error
>>
>> And another:
>>
>> 2024-04-17T16:09:47Z qemu-system-x86_64: check_section_footer: Read
>> section footer failed: -5
>> 2024-04-17T16:09:47Z qemu-system-x86_64: load of migration failed:
>> Invalid argument
>>
>> And another:
>>
>> 2024-04-30T21:53:29Z qemu-system-x86_64: Unable to read ID string for
>> section 163
>> 2024-04-30T21:53:29Z qemu-system-x86_64: load of migration failed:
>> Invalid argument
>>
>> And another:
>>
>> 2024-05-01T16:01:44Z qemu-system-x86_64: Unable to read ID string for
>> section 164
>> 2024-05-01T16:01:44Z qemu-system-x86_64: load of migration failed:
>> Invalid argument
>>
>> As you can see, they occur quite randomly, but generally it takes at
>> least 20-30+ live updates before the problem occurs.
>>
>> - Michael
>>
>> On 2/27/24 23:13, peterx@redhat.com wrote:
>>> From: Steve Sistare<steven.sistare@oracle.com>
>>>
>>> When migration for cpr is initiated, stop the vm and set state
>>> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
>>> possibility of ram and device state being out of sync, and guarantees
>>> that a guest in the suspended state remains suspended, because qmp_cont
>>> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>>>
>>> Signed-off-by: Steve Sistare<steven.sistare@oracle.com>
>>> Reviewed-by: Peter Xu<peterx@redhat.com>
>>> Link:https://urldefense.com/v3/__https://lore.kernel.org/r/1708622920-68779-11-git-send-email-steven.sistare@oracle.com__;!!GjvTz_vk!QLsFOCX-x2U9bzAo98SdidKlomHrmf_t0UmQKtgudoIcaDVoAJOPm39ZqaNP_nT5I8QqVfSgwhDZmg$
>>> Signed-off-by: Peter Xu<peterx@redhat.com>
>>> ---
>>> include/migration/misc.h | 1 +
>>> migration/migration.h | 2 --
>>> migration/migration.c | 51
>>> ++++++++++++++++++++++++----------------
>>> 3 files changed, 32 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>>> index e4933b815b..5d1aa593ed 100644
>>> --- a/include/migration/misc.h
>>> +++ b/include/migration/misc.h
>>> @@ -60,6 +60,7 @@ void migration_object_init(void);
>>> void migration_shutdown(void);
>>> bool migration_is_idle(void);
>>> bool migration_is_active(MigrationState *);
>>> +bool migrate_mode_is_cpr(MigrationState *);
>>> typedef enum MigrationEventType {
>>> MIG_EVENT_PRECOPY_SETUP,
>>> diff --git a/migration/migration.h b/migration/migration.h
>>> index aef8afbe1f..65c0b61cbd 100644
>>> --- a/migration/migration.h
>>> +++ b/migration/migration.h
>>> @@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
>>> */
>>> void migration_rp_kick(MigrationState *s);
>>> -int migration_stop_vm(RunState state);
>>> -
>>> #endif
>>> diff --git a/migration/migration.c b/migration/migration.c
>>> index 37c836b0b0..90a90947fb 100644
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -167,11 +167,19 @@ static gint
>>> page_request_addr_cmp(gconstpointer ap, gconstpointer bp)
>>> return (a > b) - (a < b);
>>> }
>>> -int migration_stop_vm(RunState state)
>>> +static int migration_stop_vm(MigrationState *s, RunState state)
>>> {
>>> - int ret = vm_stop_force_state(state);
>>> + int ret;
>>> +
>>> + migration_downtime_start(s);
>>> +
>>> + s->vm_old_state = runstate_get();
>>> + global_state_store();
>>> +
>>> + ret = vm_stop_force_state(state);
>>> trace_vmstate_downtime_checkpoint("src-vm-stopped");
>>> + trace_migration_completion_vm_stop(ret);
>>> return ret;
>>> }
>>> @@ -1602,6 +1610,11 @@ bool migration_is_active(MigrationState *s)
>>> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
>>> }
>>> +bool migrate_mode_is_cpr(MigrationState *s)
>>> +{
>>> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
>>> +}
>>> +
>>> int migrate_init(MigrationState *s, Error **errp)
>>> {
>>> int ret;
>>> @@ -2454,10 +2467,7 @@ static int postcopy_start(MigrationState *ms,
>>> Error **errp)
>>> bql_lock();
>>> trace_postcopy_start_set_run();
>>> - migration_downtime_start(ms);
>>> -
>>> - global_state_store();
>>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>> + ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
>>> if (ret < 0) {
>>> goto fail;
>>> }
>>> @@ -2652,15 +2662,12 @@ static int
>>> migration_completion_precopy(MigrationState *s,
>>> int ret;
>>> bql_lock();
>>> - migration_downtime_start(s);
>>> -
>>> - s->vm_old_state = runstate_get();
>>> - global_state_store();
>>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>> - trace_migration_completion_vm_stop(ret);
>>> - if (ret < 0) {
>>> - goto out_unlock;
>>> + if (!migrate_mode_is_cpr(s)) {
>>> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
>>> + if (ret < 0) {
>>> + goto out_unlock;
>>> + }
>>> }
>>> ret = migration_maybe_pause(s, current_active_state,
>>> @@ -3500,15 +3507,10 @@ static void *bg_migration_thread(void *opaque)
>>> s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
>>> trace_migration_thread_setup_complete();
>>> - migration_downtime_start(s);
>>> bql_lock();
>>> - s->vm_old_state = runstate_get();
>>> -
>>> - global_state_store();
>>> - /* Forcibly stop VM before saving state of vCPUs and devices */
>>> - if (migration_stop_vm(RUN_STATE_PAUSED)) {
>>> + if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
>>> goto fail;
>>> }
>>> /*
>>> @@ -3584,6 +3586,7 @@ void migrate_fd_connect(MigrationState *s,
>>> Error *error_in)
>>> Error *local_err = NULL;
>>> uint64_t rate_limit;
>>> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
>>> + int ret;
>>> /*
>>> * If there's a previous error, free it and prepare for
>>> another one.
>>> @@ -3655,6 +3658,14 @@ void migrate_fd_connect(MigrationState *s,
>>> Error *error_in)
>>> return;
>>> }
>>> + if (migrate_mode_is_cpr(s)) {
>>> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
>>> + if (ret < 0) {
>>> + error_setg(&local_err, "migration_stop_vm failed, error
>>> %d", -ret);
>>> + goto fail;
>>> + }
>>> + }
>>> +
>>> if (migrate_background_snapshot()) {
>>> qemu_thread_create(&s->thread, "bg_snapshot",
>>> bg_migration_thread, s, QEMU_THREAD_JOINABLE);
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-14 1:15 ` Michael Galaxy
@ 2024-05-14 13:39 ` Michael Galaxy
2024-05-14 13:54 ` Michael Tokarev
0 siblings, 1 reply; 37+ messages in thread
From: Michael Galaxy @ 2024-05-14 13:39 UTC (permalink / raw)
To: Steven Sistare, peterx, Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
Steve,
OK, so it does not look like this bugfix you wrote was included in 8.2.4
(which was released yesterday). Unfortunately, that means that anyone
using CPR in that release will still (eventually) encounter the bug like
I did.
I would recommend that y'all consider cherry-picking, perhaps, the
relevant commits for a possible 8.2.5 ?
- Michael
On 5/13/24 20:15, Michael Galaxy wrote:
> Hi Steve,
>
> Thanks for the response.
>
> It looks like literally *just today* 8.2.4 was released. I'll go check
> it out.
>
> - Michael
>
> On 5/13/24 15:10, Steven Sistare wrote:
>> Hi Michael,
>> No surprise here, I did see some of the same failure messages and they
>> prompted me to submit the fix. They are all symptoms of "the
>> possibility of
>> ram and device state being out of sync" as mentioned in the commit.
>>
>> I am not familiar with the process for maintaining old releases for
>> qemu.
>> Perhaps someone on this list can comment on 8.2.3.
>>
>> - Steve
>>
>> On 5/13/2024 2:22 PM, Michael Galaxy wrote:
>>> Hi Steve,
>>>
>>> We found that this specific change in particular ("migration: stop
>>> vm for cpr") fixes a bug that we've identified in testing
>>> back-to-back live updates in a lab environment.
>>>
>>> More specifically, *without* this change (which is not available in
>>> 8.2.2, but *is* available in 9.0.0) causes the metadata save file to
>>> be corrupted when doing live updates one after another. Typically we
>>> see a corrupted save file somewhere in between 20 and 30 live
>>> updates and while doing a git bisect, we found that this change
>>> makes the problem go away.
>>>
>>> Were you aware? Is there any plan in place to cherry pick this for
>>> 8.2.3, perhaps or a plan to release 8.2.3 at some point?
>>>
>>> Here are some examples of how the bug manifests in different
>>> locations of the QEMU metadata save file:
>>>
>>> 2024-04-26T13:28:53Z qemu-system-x86_64: Failed to load mtrr_var:base
>>> 2024-04-26T13:28:53Z qemu-system-x86_64: Failed to load
>>> cpu:env.mtrr_var
>>> 2024-04-26T13:28:53Z qemu-system-x86_64: error while loading state
>>> for instance 0x1b of device 'cpu'
>>> 2024-04-26T13:28:53Z qemu-system-x86_64: load of migration failed:
>>> Input/output error
>>>
>>> And another:
>>>
>>> 2024-04-17T16:09:47Z qemu-system-x86_64: check_section_footer: Read
>>> section footer failed: -5
>>> 2024-04-17T16:09:47Z qemu-system-x86_64: load of migration failed:
>>> Invalid argument
>>>
>>> And another:
>>>
>>> 2024-04-30T21:53:29Z qemu-system-x86_64: Unable to read ID string
>>> for section 163
>>> 2024-04-30T21:53:29Z qemu-system-x86_64: load of migration failed:
>>> Invalid argument
>>>
>>> And another:
>>>
>>> 2024-05-01T16:01:44Z qemu-system-x86_64: Unable to read ID string
>>> for section 164
>>> 2024-05-01T16:01:44Z qemu-system-x86_64: load of migration failed:
>>> Invalid argument
>>>
>>> As you can see, they occur quite randomly, but generally it takes at
>>> least 20-30+ live updates before the problem occurs.
>>>
>>> - Michael
>>>
>>> On 2/27/24 23:13, peterx@redhat.com wrote:
>>>> From: Steve Sistare<steven.sistare@oracle.com>
>>>>
>>>> When migration for cpr is initiated, stop the vm and set state
>>>> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
>>>> possibility of ram and device state being out of sync, and guarantees
>>>> that a guest in the suspended state remains suspended, because
>>>> qmp_cont
>>>> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>>>>
>>>> Signed-off-by: Steve Sistare<steven.sistare@oracle.com>
>>>> Reviewed-by: Peter Xu<peterx@redhat.com>
>>>> Link:https://urldefense.com/v3/__https://lore.kernel.org/r/1708622920-68779-11-git-send-email-steven.sistare@oracle.com__;!!GjvTz_vk!QLsFOCX-x2U9bzAo98SdidKlomHrmf_t0UmQKtgudoIcaDVoAJOPm39ZqaNP_nT5I8QqVfSgwhDZmg$
>>>> Signed-off-by: Peter Xu<peterx@redhat.com>
>>>> ---
>>>> include/migration/misc.h | 1 +
>>>> migration/migration.h | 2 --
>>>> migration/migration.c | 51
>>>> ++++++++++++++++++++++++----------------
>>>> 3 files changed, 32 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>>>> index e4933b815b..5d1aa593ed 100644
>>>> --- a/include/migration/misc.h
>>>> +++ b/include/migration/misc.h
>>>> @@ -60,6 +60,7 @@ void migration_object_init(void);
>>>> void migration_shutdown(void);
>>>> bool migration_is_idle(void);
>>>> bool migration_is_active(MigrationState *);
>>>> +bool migrate_mode_is_cpr(MigrationState *);
>>>> typedef enum MigrationEventType {
>>>> MIG_EVENT_PRECOPY_SETUP,
>>>> diff --git a/migration/migration.h b/migration/migration.h
>>>> index aef8afbe1f..65c0b61cbd 100644
>>>> --- a/migration/migration.h
>>>> +++ b/migration/migration.h
>>>> @@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
>>>> */
>>>> void migration_rp_kick(MigrationState *s);
>>>> -int migration_stop_vm(RunState state);
>>>> -
>>>> #endif
>>>> diff --git a/migration/migration.c b/migration/migration.c
>>>> index 37c836b0b0..90a90947fb 100644
>>>> --- a/migration/migration.c
>>>> +++ b/migration/migration.c
>>>> @@ -167,11 +167,19 @@ static gint
>>>> page_request_addr_cmp(gconstpointer ap, gconstpointer bp)
>>>> return (a > b) - (a < b);
>>>> }
>>>> -int migration_stop_vm(RunState state)
>>>> +static int migration_stop_vm(MigrationState *s, RunState state)
>>>> {
>>>> - int ret = vm_stop_force_state(state);
>>>> + int ret;
>>>> +
>>>> + migration_downtime_start(s);
>>>> +
>>>> + s->vm_old_state = runstate_get();
>>>> + global_state_store();
>>>> +
>>>> + ret = vm_stop_force_state(state);
>>>> trace_vmstate_downtime_checkpoint("src-vm-stopped");
>>>> + trace_migration_completion_vm_stop(ret);
>>>> return ret;
>>>> }
>>>> @@ -1602,6 +1610,11 @@ bool migration_is_active(MigrationState *s)
>>>> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
>>>> }
>>>> +bool migrate_mode_is_cpr(MigrationState *s)
>>>> +{
>>>> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
>>>> +}
>>>> +
>>>> int migrate_init(MigrationState *s, Error **errp)
>>>> {
>>>> int ret;
>>>> @@ -2454,10 +2467,7 @@ static int postcopy_start(MigrationState
>>>> *ms, Error **errp)
>>>> bql_lock();
>>>> trace_postcopy_start_set_run();
>>>> - migration_downtime_start(ms);
>>>> -
>>>> - global_state_store();
>>>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>>> + ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
>>>> if (ret < 0) {
>>>> goto fail;
>>>> }
>>>> @@ -2652,15 +2662,12 @@ static int
>>>> migration_completion_precopy(MigrationState *s,
>>>> int ret;
>>>> bql_lock();
>>>> - migration_downtime_start(s);
>>>> -
>>>> - s->vm_old_state = runstate_get();
>>>> - global_state_store();
>>>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>>> - trace_migration_completion_vm_stop(ret);
>>>> - if (ret < 0) {
>>>> - goto out_unlock;
>>>> + if (!migrate_mode_is_cpr(s)) {
>>>> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
>>>> + if (ret < 0) {
>>>> + goto out_unlock;
>>>> + }
>>>> }
>>>> ret = migration_maybe_pause(s, current_active_state,
>>>> @@ -3500,15 +3507,10 @@ static void *bg_migration_thread(void *opaque)
>>>> s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) -
>>>> setup_start;
>>>> trace_migration_thread_setup_complete();
>>>> - migration_downtime_start(s);
>>>> bql_lock();
>>>> - s->vm_old_state = runstate_get();
>>>> -
>>>> - global_state_store();
>>>> - /* Forcibly stop VM before saving state of vCPUs and devices */
>>>> - if (migration_stop_vm(RUN_STATE_PAUSED)) {
>>>> + if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
>>>> goto fail;
>>>> }
>>>> /*
>>>> @@ -3584,6 +3586,7 @@ void migrate_fd_connect(MigrationState *s,
>>>> Error *error_in)
>>>> Error *local_err = NULL;
>>>> uint64_t rate_limit;
>>>> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
>>>> + int ret;
>>>> /*
>>>> * If there's a previous error, free it and prepare for
>>>> another one.
>>>> @@ -3655,6 +3658,14 @@ void migrate_fd_connect(MigrationState *s,
>>>> Error *error_in)
>>>> return;
>>>> }
>>>> + if (migrate_mode_is_cpr(s)) {
>>>> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
>>>> + if (ret < 0) {
>>>> + error_setg(&local_err, "migration_stop_vm failed,
>>>> error %d", -ret);
>>>> + goto fail;
>>>> + }
>>>> + }
>>>> +
>>>> if (migrate_background_snapshot()) {
>>>> qemu_thread_create(&s->thread, "bg_snapshot",
>>>> bg_migration_thread, s, QEMU_THREAD_JOINABLE);
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-14 13:39 ` Michael Galaxy
@ 2024-05-14 13:54 ` Michael Tokarev
2024-05-14 15:33 ` Michael Tokarev
2024-05-16 17:24 ` Michael Galaxy
0 siblings, 2 replies; 37+ messages in thread
From: Michael Tokarev @ 2024-05-14 13:54 UTC (permalink / raw)
To: Michael Galaxy, Steven Sistare, peterx, Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
On 5/14/24 16:39, Michael Galaxy wrote:
> Steve,
>
> OK, so it does not look like this bugfix you wrote was included in 8.2.4
> (which was released yesterday). Unfortunately, that means that anyone
> using CPR in that release will still (eventually) encounter the bug like
> I did.
8.2.4 is basically a "bugfix" release for 8.2.3 which I somewhat
screwed up (in a minor way), plus a few currently (at the time)
queued up changes. 8.2.3 was a big release though.
> I would recommend that y'all consider cherry-picking, perhaps, the
> relevant commits for a possible 8.2.5 ?
Please Cc changes which are relevant for -stable to, well,
qemu-stable@nongnu.org :)
Which changes needs to be picked up?
Thanks,
/mjt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-14 13:54 ` Michael Tokarev
@ 2024-05-14 15:33 ` Michael Tokarev
2024-05-15 10:58 ` Steven Sistare
2024-05-16 17:24 ` Michael Galaxy
1 sibling, 1 reply; 37+ messages in thread
From: Michael Tokarev @ 2024-05-14 15:33 UTC (permalink / raw)
To: Michael Galaxy, Steven Sistare, peterx, Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
14.05.2024 16:54, Michael Tokarev пишет:
> On 5/14/24 16:39, Michael Galaxy wrote:
>> Steve,
>>
>> OK, so it does not look like this bugfix you wrote was included in 8.2.4 (which was released yesterday). Unfortunately, that means that anyone using
>> CPR in that release will still (eventually) encounter the bug like I did.
>
> 8.2.4 is basically a "bugfix" release for 8.2.3 which I somewhat
> screwed up (in a minor way), plus a few currently (at the time)
> queued up changes. 8.2.3 was a big release though.
>
>> I would recommend that y'all consider cherry-picking, perhaps, the relevant commits for a possible 8.2.5 ?
>
> Please Cc changes which are relevant for -stable to, well,
> qemu-stable@nongnu.org :)
>
> Which changes needs to be picked up?
Please also note this particular change does not apply cleanly to
stable-8.2 branch due to other changes in this area between 8.2
and 9.0, in particular, in postcopy_start(). So if this is to be
picked up for stable-8.2, I need help from someone who better
understands this code to select changes to pick.
Thanks,
/mjt
--
GPG Key transition (from rsa2048 to rsa4096) since 2024-04-24.
New key: rsa4096/61AD3D98ECDF2C8E 9D8B E14E 3F2A 9DD7 9199 28F1 61AD 3D98 ECDF 2C8E
Old key: rsa2048/457CE0A0804465C5 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
Transition statement: http://www.corpit.ru/mjt/gpg-transition-2024.txt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-14 15:33 ` Michael Tokarev
@ 2024-05-15 10:58 ` Steven Sistare
0 siblings, 0 replies; 37+ messages in thread
From: Steven Sistare @ 2024-05-15 10:58 UTC (permalink / raw)
To: Michael Tokarev, Michael Galaxy, peterx, Peter Maydell,
qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
On 5/14/2024 11:33 AM, Michael Tokarev wrote:
> 14.05.2024 16:54, Michael Tokarev пишет:
>> On 5/14/24 16:39, Michael Galaxy wrote:
>>> Steve,
>>>
>>> OK, so it does not look like this bugfix you wrote was included in 8.2.4
>>> (which was released yesterday). Unfortunately, that means that anyone using
>>> CPR in that release will still (eventually) encounter the bug like I did.
>>
>> 8.2.4 is basically a "bugfix" release for 8.2.3 which I somewhat
>> screwed up (in a minor way), plus a few currently (at the time)
>> queued up changes. 8.2.3 was a big release though.
>>
>>> I would recommend that y'all consider cherry-picking, perhaps, the relevant
>>> commits for a possible 8.2.5 ?
>>
>> Please Cc changes which are relevant for -stable to, well,
>> qemu-stable@nongnu.org :)
>>
>> Which changes needs to be picked up?
> Please also note this particular change does not apply cleanly to
> stable-8.2 branch due to other changes in this area between 8.2
> and 9.0, in particular, in postcopy_start(). So if this is to be
> picked up for stable-8.2, I need help from someone who better
> understands this code to select changes to pick.
>
> Thanks,
>
> /mjt
Michael Galaxy, I'm afraid you are out of luck with respect to qemu 8.2.
It has some of the cpr reboot commits, but is missing the following:
87a2848 migration: massage cpr-reboot documentation
cbdafc1 migration: options incompatible with cpr
ce5db1c migration: update cpr-reboot description
9867d4d migration: stop vm for cpr
4af667f migration: notifier error checking
bf78a04 migration: refactor migrate_fd_connect failures
6835f5a migration: per-mode notifiers
5663dd3 migration: MigrationNotifyFunc
c763a23e migration: remove postcopy_after_devices
9d9babf migration: MigrationEvent for notifiers
3e77573 migration: convert to NotifierWithReturn
d91f33c migration: remove error from notifier data
be19d83 notify: pass error to notifier with return
b12635f migration: fix coverity migrate_mode finding
2b58a8b tests/qtest: postcopy migration with suspend
b1fdd21 tests/qtest: precopy migration with suspend
5014478 tests/qtest: option to suspend during migration
f064975 tests/qtest: migration events
49a5020 migration: preserve suspended for bg_migration
58b1057 migration: preserve suspended for snapshot
b4e9ddc migration: preserve suspended runstate
d3c86c99 migration: propagate suspended runstate
9ff5e79 cpus: vm_resume
0f1db06 cpus: check running not RUN_STATE_RUNNING
b9ae473 cpus: stop vm in suspended runstate
f06f316 cpus: vm_was_suspended
All of those landed in qemu 9.0.
- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-14 13:54 ` Michael Tokarev
2024-05-14 15:33 ` Michael Tokarev
@ 2024-05-16 17:24 ` Michael Galaxy
2024-05-16 18:07 ` Steven Sistare
1 sibling, 1 reply; 37+ messages in thread
From: Michael Galaxy @ 2024-05-16 17:24 UTC (permalink / raw)
To: Michael Tokarev, Steven Sistare, peterx, Peter Maydell,
qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
On 5/14/24 08:54, Michael Tokarev wrote:
> On 5/14/24 16:39, Michael Galaxy wrote:
>> Steve,
>>
>> OK, so it does not look like this bugfix you wrote was included in
>> 8.2.4 (which was released yesterday). Unfortunately, that means that
>> anyone using CPR in that release will still (eventually) encounter
>> the bug like I did.
>
> 8.2.4 is basically a "bugfix" release for 8.2.3 which I somewhat
> screwed up (in a minor way), plus a few currently (at the time)
> queued up changes. 8.2.3 was a big release though.
>
>> I would recommend that y'all consider cherry-picking, perhaps, the
>> relevant commits for a possible 8.2.5 ?
>
> Please Cc changes which are relevant for -stable to, well,
> qemu-stable@nongnu.org :)
>
> Which changes needs to be picked up?
>
Steve, can you comment here, please? At a minimum, we have this one:
[PULL 20/25] migration: stop vm for cpr
But that pull came with a handful of other changes that are also not in
QEMU v8, so I suspect I'm missing some other important changes that
might be important for a stable release?
- Michael
> Thanks,
>
> /mjt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-16 17:24 ` Michael Galaxy
@ 2024-05-16 18:07 ` Steven Sistare
2024-05-17 17:48 ` Michael Galaxy
0 siblings, 1 reply; 37+ messages in thread
From: Steven Sistare @ 2024-05-16 18:07 UTC (permalink / raw)
To: Michael Galaxy, Michael Tokarev, peterx, Peter Maydell,
qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
On 5/16/2024 1:24 PM, Michael Galaxy wrote:
> On 5/14/24 08:54, Michael Tokarev wrote:
>> On 5/14/24 16:39, Michael Galaxy wrote:
>>> Steve,
>>>
>>> OK, so it does not look like this bugfix you wrote was included in 8.2.4
>>> (which was released yesterday). Unfortunately, that means that anyone using
>>> CPR in that release will still (eventually) encounter the bug like I did.
>>
>> 8.2.4 is basically a "bugfix" release for 8.2.3 which I somewhat
>> screwed up (in a minor way), plus a few currently (at the time)
>> queued up changes. 8.2.3 was a big release though.
>>
>>> I would recommend that y'all consider cherry-picking, perhaps, the relevant
>>> commits for a possible 8.2.5 ?
>>
>> Please Cc changes which are relevant for -stable to, well,
>> qemu-stable@nongnu.org :)
>>
>> Which changes needs to be picked up?
>>
> Steve, can you comment here, please? At a minimum, we have this one: [PULL
> 20/25] migration: stop vm for cpr
>
> But that pull came with a handful of other changes that are also not in QEMU v8,
> so I suspect I'm missing some other important changes that might be important
> for a stable release?
>
> - Michael
Hi Michael, I sent the full list of commits to this distribution yesterday, and
I see it in my Sent email folder. Copying verbatim:
----
Michael Galaxy, I'm afraid you are out of luck with respect to qemu 8.2.
It has some of the cpr reboot commits, but is missing the following:
87a2848 migration: massage cpr-reboot documentation
cbdafc1 migration: options incompatible with cpr
ce5db1c migration: update cpr-reboot description
9867d4d migration: stop vm for cpr
4af667f migration: notifier error checking
bf78a04 migration: refactor migrate_fd_connect failures
6835f5a migration: per-mode notifiers
5663dd3 migration: MigrationNotifyFunc
c763a23e migration: remove postcopy_after_devices
9d9babf migration: MigrationEvent for notifiers
3e77573 migration: convert to NotifierWithReturn
d91f33c migration: remove error from notifier data
be19d83 notify: pass error to notifier with return
b12635f migration: fix coverity migrate_mode finding
2b58a8b tests/qtest: postcopy migration with suspend
b1fdd21 tests/qtest: precopy migration with suspend
5014478 tests/qtest: option to suspend during migration
f064975 tests/qtest: migration events
49a5020 migration: preserve suspended for bg_migration
58b1057 migration: preserve suspended for snapshot
b4e9ddc migration: preserve suspended runstate
d3c86c99 migration: propagate suspended runstate
9ff5e79 cpus: vm_resume
0f1db06 cpus: check running not RUN_STATE_RUNNING
b9ae473 cpus: stop vm in suspended runstate
f06f316 cpus: vm_was_suspended
All of those landed in qemu 9.0.
---
- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: CPR/liveupdate: test results using prior bug fix
2024-05-16 18:07 ` Steven Sistare
@ 2024-05-17 17:48 ` Michael Galaxy
0 siblings, 0 replies; 37+ messages in thread
From: Michael Galaxy @ 2024-05-17 17:48 UTC (permalink / raw)
To: Steven Sistare, Michael Tokarev, peterx, Peter Maydell,
qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Hunt, Joshua
OK, acknowledged. Thanks, All.
- Michael
On 5/16/24 13:07, Steven Sistare wrote:
> On 5/16/2024 1:24 PM, Michael Galaxy wrote:
>> On 5/14/24 08:54, Michael Tokarev wrote:
>>> On 5/14/24 16:39, Michael Galaxy wrote:
>>>> Steve,
>>>>
>>>> OK, so it does not look like this bugfix you wrote was included in
>>>> 8.2.4 (which was released yesterday). Unfortunately, that means
>>>> that anyone using CPR in that release will still (eventually)
>>>> encounter the bug like I did.
>>>
>>> 8.2.4 is basically a "bugfix" release for 8.2.3 which I somewhat
>>> screwed up (in a minor way), plus a few currently (at the time)
>>> queued up changes. 8.2.3 was a big release though.
>>>
>>>> I would recommend that y'all consider cherry-picking, perhaps, the
>>>> relevant commits for a possible 8.2.5 ?
>>>
>>> Please Cc changes which are relevant for -stable to, well,
>>> qemu-stable@nongnu.org :)
>>>
>>> Which changes needs to be picked up?
>>>
>> Steve, can you comment here, please? At a minimum, we have this one:
>> [PULL 20/25] migration: stop vm for cpr
>>
>> But that pull came with a handful of other changes that are also not
>> in QEMU v8, so I suspect I'm missing some other important changes
>> that might be important for a stable release?
>>
>> - Michael
>
> Hi Michael, I sent the full list of commits to this distribution
> yesterday, and
> I see it in my Sent email folder. Copying verbatim:
>
> ----
> Michael Galaxy, I'm afraid you are out of luck with respect to qemu 8.2.
> It has some of the cpr reboot commits, but is missing the following:
>
> 87a2848 migration: massage cpr-reboot documentation
> cbdafc1 migration: options incompatible with cpr
> ce5db1c migration: update cpr-reboot description
> 9867d4d migration: stop vm for cpr
> 4af667f migration: notifier error checking
> bf78a04 migration: refactor migrate_fd_connect failures
> 6835f5a migration: per-mode notifiers
> 5663dd3 migration: MigrationNotifyFunc
> c763a23e migration: remove postcopy_after_devices
> 9d9babf migration: MigrationEvent for notifiers
> 3e77573 migration: convert to NotifierWithReturn
> d91f33c migration: remove error from notifier data
> be19d83 notify: pass error to notifier with return
> b12635f migration: fix coverity migrate_mode finding
> 2b58a8b tests/qtest: postcopy migration with suspend
> b1fdd21 tests/qtest: precopy migration with suspend
> 5014478 tests/qtest: option to suspend during migration
> f064975 tests/qtest: migration events
> 49a5020 migration: preserve suspended for bg_migration
> 58b1057 migration: preserve suspended for snapshot
> b4e9ddc migration: preserve suspended runstate
> d3c86c99 migration: propagate suspended runstate
> 9ff5e79 cpus: vm_resume
> 0f1db06 cpus: check running not RUN_STATE_RUNNING
> b9ae473 cpus: stop vm in suspended runstate
> f06f316 cpus: vm_was_suspended
>
> All of those landed in qemu 9.0.
> ---
>
> - Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PULL 21/25] migration: update cpr-reboot description
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (19 preceding siblings ...)
2024-02-28 5:13 ` [PULL 20/25] migration: stop vm for cpr peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 22/25] migration: options incompatible with cpr peterx
` (4 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Clarify qapi for cpr-reboot migration mode, and add vfio support.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-14-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
qapi/migration.json | 35 ++++++++++++++++++++++-------------
1 file changed, 22 insertions(+), 13 deletions(-)
diff --git a/qapi/migration.json b/qapi/migration.json
index 7303e57e8e..bee5e71fe3 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -636,19 +636,28 @@
#
# @normal: the original form of migration. (since 8.2)
#
-# @cpr-reboot: The migrate command saves state to a file, allowing one to
-# quit qemu, reboot to an updated kernel, and restart an updated
-# version of qemu. The caller must specify a migration URI
-# that writes to and reads from a file. Unlike normal mode,
-# the use of certain local storage options does not block the
-# migration, but the caller must not modify guest block devices
-# between the quit and restart. To avoid saving guest RAM to the
-# file, the memory backend must be shared, and the @x-ignore-shared
-# migration capability must be set. Guest RAM must be non-volatile
-# across reboot, such as by backing it with a dax device, but this
-# is not enforced. The restarted qemu arguments must match those
-# used to initially start qemu, plus the -incoming option.
-# (since 8.2)
+# @cpr-reboot: The migrate command stops the VM and saves state to the URI.
+# After quitting qemu, the user resumes by running qemu -incoming.
+#
+# This mode allows the user to quit qemu, and restart an updated version
+# of qemu. The user may even update and reboot the OS before restarting,
+# as long as the URI persists across a reboot.
+#
+# Unlike normal mode, the use of certain local storage options does not
+# block the migration, but the user must not modify guest block devices
+# between the quit and restart.
+#
+# This mode supports vfio devices provided the user first puts the guest
+# in the suspended runstate, such as by issuing guest-suspend-ram to the
+# qemu guest agent.
+#
+# Best performance is achieved when the memory backend is shared and the
+# @x-ignore-shared migration capability is set, but this is not required.
+# Further, if the user reboots before restarting such a configuration, the
+# shared backend must be be non-volatile across reboot, such as by backing
+# it with a dax device.
+#
+# (since 8.2)
##
{ 'enum': 'MigMode',
'data': [ 'normal', 'cpr-reboot' ] }
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 22/25] migration: options incompatible with cpr
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (20 preceding siblings ...)
2024-02-28 5:13 ` [PULL 21/25] migration: update cpr-reboot description peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 23/25] migration: Fix qmp_query_migrate mbps value peterx
` (3 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu, Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Fail the migration request if options are set that are incompatible
with cpr.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-15-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
qapi/migration.json | 2 ++
migration/migration.c | 17 +++++++++++++++++
2 files changed, 19 insertions(+)
diff --git a/qapi/migration.json b/qapi/migration.json
index bee5e71fe3..0b33a71ab4 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -657,6 +657,8 @@
# shared backend must be be non-volatile across reboot, such as by backing
# it with a dax device.
#
+# cpr-reboot may not be used with postcopy, colo, or background-snapshot.
+#
# (since 8.2)
##
{ 'enum': 'MigMode',
diff --git a/migration/migration.c b/migration/migration.c
index 90a90947fb..7652fd4d14 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1953,6 +1953,23 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
return false;
}
+ if (migrate_mode_is_cpr(s)) {
+ const char *conflict = NULL;
+
+ if (migrate_postcopy()) {
+ conflict = "postcopy";
+ } else if (migrate_background_snapshot()) {
+ conflict = "background snapshot";
+ } else if (migrate_colo()) {
+ conflict = "COLO";
+ }
+
+ if (conflict) {
+ error_setg(errp, "Cannot use %s with CPR", conflict);
+ return false;
+ }
+ }
+
if (blk || blk_inc) {
if (migrate_colo()) {
error_setg(errp, "No disk migration is required in COLO mode");
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 23/25] migration: Fix qmp_query_migrate mbps value
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (21 preceding siblings ...)
2024-02-28 5:13 ` [PULL 22/25] migration: options incompatible with cpr peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 24/25] migration: Join the return path thread before releasing to_dst_file peterx
` (2 subsequent siblings)
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Fabiano Rosas <farosas@suse.de>
The QMP command query_migrate might see incorrect throughput numbers
if it runs after we've set the migration completion status but before
migration_calculate_complete() has updated s->total_time and s->mbps.
The migration status would show COMPLETED, but the throughput value
would be the one from the last iteration and not the one from the
whole migration. This will usually be a larger value due to the time
period being smaller (one iteration).
Move migration_calculate_complete() earlier so that the status
MIGRATION_STATUS_COMPLETED is only emitted after the final counters
update. Keep everything under the BQL so the QMP thread sees the
updates as atomic.
Rename migration_calculate_complete to migration_completion_end to
reflect its new purpose of also updating s->state.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226143335.14282-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 7652fd4d14..ccb13fa94a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -107,6 +107,7 @@ static int migration_maybe_pause(MigrationState *s,
int new_state);
static void migrate_fd_cancel(MigrationState *s);
static bool close_return_path_on_source(MigrationState *s);
+static void migration_completion_end(MigrationState *s);
static void migration_downtime_start(MigrationState *s)
{
@@ -2787,8 +2788,7 @@ static void migration_completion(MigrationState *s)
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_COLO);
} else {
- migrate_set_state(&s->state, current_active_state,
- MIGRATION_STATUS_COMPLETED);
+ migration_completion_end(s);
}
return;
@@ -2825,8 +2825,7 @@ static void bg_migration_completion(MigrationState *s)
goto fail;
}
- migrate_set_state(&s->state, current_active_state,
- MIGRATION_STATUS_COMPLETED);
+ migration_completion_end(s);
return;
fail:
@@ -3028,18 +3027,28 @@ static MigThrError migration_detect_error(MigrationState *s)
}
}
-static void migration_calculate_complete(MigrationState *s)
+static void migration_completion_end(MigrationState *s)
{
uint64_t bytes = migration_transferred_bytes();
int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
int64_t transfer_time;
+ /*
+ * Take the BQL here so that query-migrate on the QMP thread sees:
+ * - atomic update of s->total_time and s->mbps;
+ * - correct ordering of s->mbps update vs. s->state;
+ */
+ bql_lock();
migration_downtime_end(s);
s->total_time = end_time - s->start_time;
transfer_time = s->total_time - s->setup_time;
if (transfer_time) {
s->mbps = ((double) bytes * 8.0) / transfer_time / 1000;
}
+
+ migrate_set_state(&s->state, s->state,
+ MIGRATION_STATUS_COMPLETED);
+ bql_unlock();
}
static void update_iteration_initial_status(MigrationState *s)
@@ -3186,7 +3195,6 @@ static void migration_iteration_finish(MigrationState *s)
bql_lock();
switch (s->state) {
case MIGRATION_STATUS_COMPLETED:
- migration_calculate_complete(s);
runstate_set(RUN_STATE_POSTMIGRATE);
break;
case MIGRATION_STATUS_COLO:
@@ -3230,9 +3238,6 @@ static void bg_migration_iteration_finish(MigrationState *s)
bql_lock();
switch (s->state) {
case MIGRATION_STATUS_COMPLETED:
- migration_calculate_complete(s);
- break;
-
case MIGRATION_STATUS_ACTIVE:
case MIGRATION_STATUS_FAILED:
case MIGRATION_STATUS_CANCELLED:
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 24/25] migration: Join the return path thread before releasing to_dst_file
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (22 preceding siblings ...)
2024-02-28 5:13 ` [PULL 23/25] migration: Fix qmp_query_migrate mbps value peterx
@ 2024-02-28 5:13 ` peterx
2024-02-28 5:13 ` [PULL 25/25] migration: Use migrate_has_error() in close_return_path_on_source() peterx
2024-02-29 15:24 ` [PULL 00/25] Migration next patches Peter Maydell
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Fabiano Rosas <farosas@suse.de>
The return path thread might hang at a blocking system call. Before
joining the thread we might need to issue a shutdown() on the socket
file descriptor to release it. To determine whether the shutdown() is
necessary we look at the QEMUFile error.
Make sure we only clean up the QEMUFile after the return path has been
waited for.
This fixes a hang when qemu_savevm_state_setup() produced an error
that was detected by migration_detect_error(). That skips
migration_completion() so close_return_path_on_source() would get
stuck waiting for the RP thread to terminate.
Reported-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226203122.22894-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index ccb13fa94a..7ba2b60e46 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1342,6 +1342,8 @@ static void migrate_fd_cleanup(MigrationState *s)
qemu_savevm_state_cleanup();
+ close_return_path_on_source(s);
+
if (s->to_dst_file) {
QEMUFile *tmp;
@@ -1366,12 +1368,6 @@ static void migrate_fd_cleanup(MigrationState *s)
qemu_fclose(tmp);
}
- /*
- * We already cleaned up to_dst_file, so errors from the return
- * path might be due to that, ignore them.
- */
- close_return_path_on_source(s);
-
assert(!migration_is_active(s));
if (s->state == MIGRATION_STATUS_CANCELLING) {
@@ -2914,6 +2910,13 @@ static MigThrError postcopy_pause(MigrationState *s)
while (true) {
QEMUFile *file;
+ /*
+ * We're already pausing, so ignore any errors on the return
+ * path and just wait for the thread to finish. It will be
+ * re-created when we resume.
+ */
+ close_return_path_on_source(s);
+
/*
* Current channel is possibly broken. Release it. Note that this is
* guaranteed even without lock because to_dst_file should only be
@@ -2933,13 +2936,6 @@ static MigThrError postcopy_pause(MigrationState *s)
qemu_file_shutdown(file);
qemu_fclose(file);
- /*
- * We're already pausing, so ignore any errors on the return
- * path and just wait for the thread to finish. It will be
- * re-created when we resume.
- */
- close_return_path_on_source(s);
-
migrate_set_state(&s->state, s->state,
MIGRATION_STATUS_POSTCOPY_PAUSED);
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PULL 25/25] migration: Use migrate_has_error() in close_return_path_on_source()
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (23 preceding siblings ...)
2024-02-28 5:13 ` [PULL 24/25] migration: Join the return path thread before releasing to_dst_file peterx
@ 2024-02-28 5:13 ` peterx
2024-02-29 15:24 ` [PULL 00/25] Migration next patches Peter Maydell
25 siblings, 0 replies; 37+ messages in thread
From: peterx @ 2024-02-28 5:13 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: David Hildenbrand, Eric Blake, Laurent Vivier, Paolo Bonzini,
Cédric Le Goater, Gerd Hoffmann, Marc-André Lureau,
Markus Armbruster, Fabiano Rosas, Michael S. Tsirkin, Thomas Huth,
Jason Wang, Alex Williamson, Peter Xu
From: Cédric Le Goater <clg@redhat.com>
close_return_path_on_source() retrieves the migration error from the
the QEMUFile '->to_dst_file' to know if a shutdown is required. This
shutdown is required to exit the return-path thread.
Avoid relying on '->to_dst_file' and use migrate_has_error() instead.
(using to_dst_file is a heuristic to infer whether
rp_state.from_dst_file might be stuck on a recvmsg(). Using a generic
method for detecting errors is more reliable. We also want to reduce
dependency on QEMUFile::last_error)
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
[added some words about the motivation for this patch]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226203122.22894-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 7ba2b60e46..bab68bcbef 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2429,8 +2429,7 @@ static bool close_return_path_on_source(MigrationState *ms)
* cause it to unblock if it's stuck waiting for the destination.
*/
WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
- if (ms->to_dst_file && ms->rp_state.from_dst_file &&
- qemu_file_get_error(ms->to_dst_file)) {
+ if (migrate_has_error(ms) && ms->rp_state.from_dst_file) {
qemu_file_shutdown(ms->rp_state.from_dst_file);
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PULL 00/25] Migration next patches
2024-02-28 5:12 [PULL 00/25] Migration next patches peterx
` (24 preceding siblings ...)
2024-02-28 5:13 ` [PULL 25/25] migration: Use migrate_has_error() in close_return_path_on_source() peterx
@ 2024-02-29 15:24 ` Peter Maydell
25 siblings, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2024-02-29 15:24 UTC (permalink / raw)
To: peterx
Cc: qemu-devel, David Hildenbrand, Eric Blake, Laurent Vivier,
Paolo Bonzini, Cédric Le Goater, Gerd Hoffmann,
Marc-André Lureau, Markus Armbruster, Fabiano Rosas,
Michael S. Tsirkin, Thomas Huth, Jason Wang, Alex Williamson
On Wed, 28 Feb 2024 at 05:13, <peterx@redhat.com> wrote:
>
> From: Peter Xu <peterx@redhat.com>
>
> The following changes since commit 158a054c4d1a40179f5e83cd7e1cfe65de457b92:
>
> Merge tag 'pull-target-arm-20240227-1' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2024-02-27 15:34:41 +0000)
>
> are available in the Git repository at:
>
> https://gitlab.com/peterx/qemu.git tags/migration-next-pull-request
>
> for you to fetch changes up to 9425ef3f990a42b98329d5059362f40714e70442:
>
> migration: Use migrate_has_error() in close_return_path_on_source() (2024-02-28 11:31:28 +0800)
>
> ----------------------------------------------------------------
> Migration pull request
>
> - Fabiano's fixed-ram patches (1-5 only)
> - Peter's cleanups on multifd tls IOC referencing
> - Steve's cpr patches for vfio (migration patches only)
> - Fabiano's fix on mbps stats racing with COMPLETE state
> - Fabiano's fix on return path thread hang
>
Applied, thanks.
Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0
for any user-visible changes.
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread