* [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups
@ 2025-12-01 19:45 Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 1/7] migration: Use explicit error_free() instead of g_autoptr Peter Xu
` (9 more replies)
0 siblings, 10 replies; 18+ messages in thread
From: Peter Xu @ 2025-12-01 19:45 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, peterx, Juraj Marcin
Based-on: <20251125070554.2256181-1-armbru@redhat.com>
This series is based on Markus's recent fix:
[PATCH] migration: Fix double-free on error path
https://lore.kernel.org/r/20251125070554.2256181-1-armbru@redhat.com
v2:
- Added R-bs
- Patch 1:
- update commit message on s/accidentally merged/merged without proper
review/ [Markus]
- Patch 2:
- Added a new follow up patch here from Markus to poison Error's autoptr
- Patch 3:
- Rename migration_connect_set_error to migration_connect_error_propagate
[Markus]
- Add comments in commit log for both migrate_connect() and the rename
[Markus]
- Patch 4:
- Rename multifd_send_set_error to multifd_send_error_propagate [Markus]
- Patch 6:
- Make migrate_error_propagate() take MigrationState* as before [Markus]
- Remove the one use case of g_clear_pointer() [Markus]
- Touch up commit message for the change
This series should address the issues discussed in this thread here:
https://lore.kernel.org/r/871plmk1bc.fsf@pond.sub.org
The problem is Error is not a good candidate of g_autoptr, however the
cleanup function was merged without enough review. Luckily, we only have
two users so far (after Markus's patch above lands). This series removes
the last two in migration code and reverts the auto cleanup function for
Error. Instead, poison the auto cleanup function.
When at it, it'll also change migrate_set_error() to start taking ownership
of errors, just like what most error APIs do. When at it, it is renamed to
migrate_error_propagate() to imply migration version of error_propagate().
Comments welcomed, thanks.
Markus Armbruster (1):
error: Poison g_autoptr(Error) to prevent its use
Peter Xu (6):
migration: Use explicit error_free() instead of g_autoptr
Revert "error: define g_autoptr() cleanup function for the Error type"
migration: Make migration_connect_set_error() own the error
migration: Make multifd_send_set_error() own the error
migration: Make multifd_recv_terminate_threads() own the error
migration: Replace migrate_set_error() with migrate_error_propagate()
include/qapi/error.h | 20 ++++++++++++-
migration/migration.h | 2 +-
migration/channel.c | 1 -
migration/cpr-exec.c | 5 ++--
migration/migration.c | 51 +++++++++++++++-----------------
migration/multifd-device-state.c | 6 ++--
migration/multifd.c | 30 +++++++++----------
migration/postcopy-ram.c | 5 ++--
migration/ram.c | 4 +--
migration/savevm.c | 17 +++++------
10 files changed, 73 insertions(+), 68 deletions(-)
--
2.50.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH for-11.0 v2 1/7] migration: Use explicit error_free() instead of g_autoptr
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
@ 2025-12-01 19:45 ` Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 2/7] Revert "error: define g_autoptr() cleanup function for the Error type" Peter Xu
` (8 subsequent siblings)
9 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2025-12-01 19:45 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, peterx, Juraj Marcin
There're only two use cases of g_autoptr to free Error objects in migration
code paths.
Due to the nature of how Error should be used (normally ownership will be
passed over to Error APIs, like error_report_err), auto-free functions may
be error prone on its own. The auto cleanup function was merged without
proper review, as pointed out by Dan and Markus:
https://lore.kernel.org/r/aSWSLMi6ZhTCS_p2@redhat.com
Remove the two use cases so that we can remove the auto cleanup function,
hence suggest to not use auto frees for Errors.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd-device-state.c | 3 ++-
migration/savevm.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
index fce64f00b0..db3239fef5 100644
--- a/migration/multifd-device-state.c
+++ b/migration/multifd-device-state.c
@@ -140,7 +140,7 @@ static void multifd_device_state_save_thread_data_free(void *opaque)
static int multifd_device_state_save_thread(void *opaque)
{
SaveCompletePrecopyThreadData *data = opaque;
- g_autoptr(Error) local_err = NULL;
+ Error *local_err = NULL;
if (!data->hdlr(data, &local_err)) {
MigrationState *s = migrate_get_current();
@@ -159,6 +159,7 @@ static int multifd_device_state_save_thread(void *opaque)
* return we end setting is purely arbitrary.
*/
migrate_set_error(s, local_err);
+ error_free(local_err);
}
return 0;
diff --git a/migration/savevm.c b/migration/savevm.c
index 62cc2ce25c..638e9b364f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2823,7 +2823,7 @@ static int qemu_loadvm_load_thread(void *thread_opaque)
{
struct LoadThreadData *data = thread_opaque;
MigrationIncomingState *mis = migration_incoming_get_current();
- g_autoptr(Error) local_err = NULL;
+ Error *local_err = NULL;
if (!data->function(data->opaque, &mis->load_threads_abort, &local_err)) {
MigrationState *s = migrate_get_current();
@@ -2841,6 +2841,7 @@ static int qemu_loadvm_load_thread(void *thread_opaque)
* return we end setting is purely arbitrary.
*/
migrate_set_error(s, local_err);
+ error_free(local_err);
}
return 0;
--
2.50.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH for-11.0 v2 2/7] Revert "error: define g_autoptr() cleanup function for the Error type"
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 1/7] migration: Use explicit error_free() instead of g_autoptr Peter Xu
@ 2025-12-01 19:45 ` Peter Xu
2025-12-02 7:53 ` Cédric Le Goater
2025-12-01 19:45 ` [PATCH for-11.0 v2 3/7] error: Poison g_autoptr(Error) to prevent its use Peter Xu
` (7 subsequent siblings)
9 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2025-12-01 19:45 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, peterx, Juraj Marcin,
Cédric Le Goater, Maciej S. Szmigiero
This reverts commit 18eb55546a54e443d94a4c49286348176ad4b00a.
Due to the nature of how Error should be used (normally ownership will be
passed over to Error APIs, like error_report_err), auto-free functions may
be error prone on its own. The auto cleanup function was merged without
proper review as pointed out by Dan and Markus:
https://lore.kernel.org/r/aSWSLMi6ZhTCS_p2@redhat.com
Cc: Cédric Le Goater <clg@redhat.com>
Acked-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/qapi/error.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/include/qapi/error.h b/include/qapi/error.h
index b16c6303f8..f3ce4a4a2d 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -437,8 +437,6 @@ Error *error_copy(const Error *err);
*/
void error_free(Error *err);
-G_DEFINE_AUTOPTR_CLEANUP_FUNC(Error, error_free)
-
/*
* Convenience function to assert that *@errp is set, then silently free it.
*/
--
2.50.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH for-11.0 v2 3/7] error: Poison g_autoptr(Error) to prevent its use
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 1/7] migration: Use explicit error_free() instead of g_autoptr Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 2/7] Revert "error: define g_autoptr() cleanup function for the Error type" Peter Xu
@ 2025-12-01 19:45 ` Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 4/7] migration: Make migration_connect_set_error() own the error Peter Xu
` (6 subsequent siblings)
9 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2025-12-01 19:45 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, peterx, Juraj Marcin, Peter Maydell,
Cédric Le Goater
From: Markus Armbruster <armbru@redhat.com>
The previous commit reverted support for g_autoptr(Error). This one
should stop it from coming back.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/qapi/error.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/include/qapi/error.h b/include/qapi/error.h
index f3ce4a4a2d..2356b84bb3 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -437,6 +437,26 @@ Error *error_copy(const Error *err);
*/
void error_free(Error *err);
+/*
+ * Poison g_autoptr(Error) to prevent its use.
+ *
+ * Functions that report or propagate an error take ownership of the
+ * Error object. Explicit error_free() is needed when you handle an
+ * error in some other way. This is rare.
+ *
+ * g_autoptr(Error) would call error_free() automatically on return.
+ * To avoid a double-free, we'd have to manually clear the pointer
+ * every time we propagate or report.
+ *
+ * Thus, g_autoptr(Error) would make the rare case easier to get right
+ * (less prone to leaks), and the common case easier to get wrong
+ * (more prone to double-free).
+ */
+extern void
+__attribute__((error("Do not use g_autoptr() to declare Error * variables")))
+error_free_poisoned(Error *err);
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(Error, error_free_poisoned)
+
/*
* Convenience function to assert that *@errp is set, then silently free it.
*/
--
2.50.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH for-11.0 v2 4/7] migration: Make migration_connect_set_error() own the error
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
` (2 preceding siblings ...)
2025-12-01 19:45 ` [PATCH for-11.0 v2 3/7] error: Poison g_autoptr(Error) to prevent its use Peter Xu
@ 2025-12-01 19:45 ` Peter Xu
2025-12-02 11:42 ` Markus Armbruster
2025-12-01 19:45 ` [PATCH for-11.0 v2 5/7] migration: Make multifd_send_set_error() " Peter Xu
` (5 subsequent siblings)
9 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2025-12-01 19:45 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, peterx, Juraj Marcin
Make migration_connect_set_error() take ownership of the error always.
Paving way for making migrate_set_error() to take ownership.
When at it, renaming it to migration_connect_error_propagate(), following
Error API, to imply the Error object ownership transition.
NOTE: this patch also makes migration_connect() to take ownership of the
Error passed in.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/channel.c | 1 -
migration/migration.c | 9 +++++----
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/migration/channel.c b/migration/channel.c
index 462cc183e1..92435fa7f7 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -95,7 +95,6 @@ void migration_channel_connect(MigrationState *s,
}
}
migration_connect(s, error);
- error_free(error);
}
diff --git a/migration/migration.c b/migration/migration.c
index b316ee01ab..0ff8b31a88 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1575,7 +1575,7 @@ static void migrate_error_free(MigrationState *s)
}
}
-static void migration_connect_set_error(MigrationState *s, const Error *error)
+static void migration_connect_error_propagate(MigrationState *s, Error *error)
{
MigrationStatus current = s->state;
MigrationStatus next;
@@ -1602,6 +1602,7 @@ static void migration_connect_set_error(MigrationState *s, const Error *error)
migrate_set_state(&s->state, current, next);
migrate_set_error(s, error);
+ error_free(error);
}
void migration_cancel(void)
@@ -2292,7 +2293,7 @@ void qmp_migrate(const char *uri, bool has_channels,
out:
if (local_err) {
- migration_connect_set_error(s, local_err);
+ migration_connect_error_propagate(s, error_copy(local_err));
error_propagate(errp, local_err);
}
}
@@ -2337,7 +2338,7 @@ static void qmp_migrate_finish(MigrationAddress *addr, bool resume_requested,
if (!resume_requested) {
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
}
- migration_connect_set_error(s, local_err);
+ migration_connect_error_propagate(s, error_copy(local_err));
error_propagate(errp, local_err);
return;
}
@@ -4039,7 +4040,7 @@ void migration_connect(MigrationState *s, Error *error_in)
s->expected_downtime = migrate_downtime_limit();
if (error_in) {
- migration_connect_set_error(s, error_in);
+ migration_connect_error_propagate(s, error_in);
if (resume) {
/*
* Don't do cleanup for resume if channel is invalid, but only dump
--
2.50.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH for-11.0 v2 5/7] migration: Make multifd_send_set_error() own the error
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
` (3 preceding siblings ...)
2025-12-01 19:45 ` [PATCH for-11.0 v2 4/7] migration: Make migration_connect_set_error() own the error Peter Xu
@ 2025-12-01 19:45 ` Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 6/7] migration: Make multifd_recv_terminate_threads() " Peter Xu
` (4 subsequent siblings)
9 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2025-12-01 19:45 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, peterx, Juraj Marcin
Make multifd_send_set_error() take ownership of the error always. Paving
way for making migrate_set_error() to take ownership.
When at it, rename it to multifd_send_error_propagate() to imply the
ownership transition following Error API's naming style.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 3203dc98e1..651ea3d14b 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -414,7 +414,7 @@ bool multifd_send(MultiFDSendData **send_data)
}
/* Multifd send side hit an error; remember it and prepare to quit */
-static void multifd_send_set_error(Error *err)
+static void multifd_send_error_propagate(Error *err)
{
/*
* We don't want to exit each threads twice. Depending on where
@@ -429,6 +429,7 @@ static void multifd_send_set_error(Error *err)
if (err) {
MigrationState *s = migrate_get_current();
migrate_set_error(s, err);
+ error_free(err);
if (s->state == MIGRATION_STATUS_SETUP ||
s->state == MIGRATION_STATUS_PRE_SWITCHOVER ||
s->state == MIGRATION_STATUS_DEVICE ||
@@ -777,9 +778,8 @@ out:
if (ret) {
assert(local_err);
trace_multifd_send_error(p->id);
- multifd_send_set_error(local_err);
+ multifd_send_error_propagate(local_err);
multifd_send_kick_main(p);
- error_free(local_err);
}
rcu_unregister_thread();
@@ -901,14 +901,13 @@ out:
}
trace_multifd_new_send_channel_async_error(p->id, local_err);
- multifd_send_set_error(local_err);
+ multifd_send_error_propagate(local_err);
/*
* For error cases (TLS or non-TLS), IO channel is always freed here
* rather than when cleanup multifd: since p->c is not set, multifd
* cleanup code doesn't even know its existence.
*/
object_unref(OBJECT(ioc));
- error_free(local_err);
}
static bool multifd_new_send_channel_create(gpointer opaque, Error **errp)
--
2.50.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH for-11.0 v2 6/7] migration: Make multifd_recv_terminate_threads() own the error
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
` (4 preceding siblings ...)
2025-12-01 19:45 ` [PATCH for-11.0 v2 5/7] migration: Make multifd_send_set_error() " Peter Xu
@ 2025-12-01 19:45 ` Peter Xu
2025-12-02 11:44 ` Markus Armbruster
2025-12-01 19:45 ` [PATCH for-11.0 v2 7/7] migration: Replace migrate_set_error() with migrate_error_propagate() Peter Xu
` (3 subsequent siblings)
9 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2025-12-01 19:45 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, peterx, Juraj Marcin
Make multifd_recv_terminate_threads() take ownership of the error always.
Paving way for making migrate_set_error() to take ownership.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 651ea3d14b..52e4d25857 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1068,6 +1068,7 @@ static void multifd_recv_terminate_threads(Error *err)
if (err) {
MigrationState *s = migrate_get_current();
migrate_set_error(s, err);
+ error_free(err);
if (s->state == MIGRATION_STATUS_SETUP ||
s->state == MIGRATION_STATUS_ACTIVE) {
migrate_set_state(&s->state, s->state,
@@ -1434,7 +1435,6 @@ static void *multifd_recv_thread(void *opaque)
if (local_err) {
multifd_recv_terminate_threads(local_err);
- error_free(local_err);
}
rcu_unregister_thread();
@@ -1535,7 +1535,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
if (use_packets) {
id = multifd_recv_initial_packet(ioc, &local_err);
if (id < 0) {
- multifd_recv_terminate_threads(local_err);
+ multifd_recv_terminate_threads(error_copy(local_err));
error_propagate_prepend(errp, local_err,
"failed to receive packet"
" via multifd channel %d: ",
@@ -1551,7 +1551,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
if (p->c != NULL) {
error_setg(&local_err, "multifd: received id '%d' already setup'",
id);
- multifd_recv_terminate_threads(local_err);
+ multifd_recv_terminate_threads(error_copy(local_err));
error_propagate(errp, local_err);
return;
}
--
2.50.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH for-11.0 v2 7/7] migration: Replace migrate_set_error() with migrate_error_propagate()
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
` (5 preceding siblings ...)
2025-12-01 19:45 ` [PATCH for-11.0 v2 6/7] migration: Make multifd_recv_terminate_threads() " Peter Xu
@ 2025-12-01 19:45 ` Peter Xu
2025-12-02 11:59 ` Markus Armbruster
2025-12-02 13:55 ` [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Fabiano Rosas
` (2 subsequent siblings)
9 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2025-12-01 19:45 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, peterx, Juraj Marcin
migrate_set_error() currently doesn't take ownership of the error being
passed in. It's not aligned with the error API and meanwhile it also
makes most of the caller free the error explicitly.
Change the API to take the ownership of the Error object instead. This
should save a lot of error_copy() invocations.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.h | 2 +-
migration/cpr-exec.c | 5 ++--
migration/migration.c | 44 +++++++++++++++-----------------
migration/multifd-device-state.c | 5 +---
migration/multifd.c | 19 +++++++-------
migration/postcopy-ram.c | 5 ++--
migration/ram.c | 4 +--
migration/savevm.c | 16 +++++-------
8 files changed, 43 insertions(+), 57 deletions(-)
diff --git a/migration/migration.h b/migration/migration.h
index 213b33fe6e..e4b4f25deb 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -525,7 +525,7 @@ void migration_incoming_process(void);
bool migration_has_all_channels(void);
-void migrate_set_error(MigrationState *s, const Error *error);
+void migrate_error_propagate(MigrationState *s, Error *error);
bool migrate_has_error(MigrationState *s);
void migration_connect(MigrationState *s, Error *error_in);
diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c
index 0b8344a86f..da287d8031 100644
--- a/migration/cpr-exec.c
+++ b/migration/cpr-exec.c
@@ -158,8 +158,9 @@ static void cpr_exec_cb(void *opaque)
error_report_err(error_copy(err));
migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
- migrate_set_error(s, err);
- error_free(err);
+
+ migrate_error_propagate(s, err);
+ /* We must reset the error because it'll be reused later */
err = NULL;
/* Note, we can go from state COMPLETED to FAILED */
diff --git a/migration/migration.c b/migration/migration.c
index 0ff8b31a88..70813e5006 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -914,9 +914,7 @@ process_incoming_migration_co(void *opaque)
fail:
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
- migrate_set_error(s, local_err);
- error_free(local_err);
-
+ migrate_error_propagate(s, local_err);
migration_incoming_state_destroy();
if (mis->exit_on_error) {
@@ -1548,14 +1546,20 @@ static void migration_cleanup_bh(void *opaque)
migration_cleanup(opaque);
}
-void migrate_set_error(MigrationState *s, const Error *error)
+/*
+ * Propagate the Error* object to migration core. The caller mustn't
+ * reference the error pointer after the function returned, because the
+ * Error* object might be freed.
+ */
+void migrate_error_propagate(MigrationState *s, Error *error)
{
QEMU_LOCK_GUARD(&s->error_mutex);
-
trace_migrate_error(error_get_pretty(error));
if (!s->error) {
- s->error = error_copy(error);
+ s->error = error;
+ } else {
+ error_free(error);
}
}
@@ -1601,8 +1605,7 @@ static void migration_connect_error_propagate(MigrationState *s, Error *error)
}
migrate_set_state(&s->state, current, next);
- migrate_set_error(s, error);
- error_free(error);
+ migrate_error_propagate(s, error);
}
void migration_cancel(void)
@@ -2014,8 +2017,7 @@ void qmp_migrate_pause(Error **errp)
/* Tell the core migration that we're pausing */
error_setg(&error, "Postcopy migration is paused by the user");
- migrate_set_error(ms, error);
- error_free(error);
+ migrate_error_propagate(ms, error);
qemu_mutex_lock(&ms->qemu_file_lock);
if (ms->to_dst_file) {
@@ -2647,8 +2649,7 @@ static void *source_return_path_thread(void *opaque)
out:
if (err) {
- migrate_set_error(ms, err);
- error_free(err);
+ migrate_error_propagate(ms, err);
trace_source_return_path_thread_bad_end();
}
@@ -3094,12 +3095,10 @@ static void migration_completion(MigrationState *s)
fail:
if (qemu_file_get_error_obj(s->to_dst_file, &local_err)) {
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(s, local_err);
} else if (ret) {
error_setg_errno(&local_err, -ret, "Error in migration completion");
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(s, local_err);
}
if (s->state != MIGRATION_STATUS_CANCELLING) {
@@ -3326,8 +3325,7 @@ static MigThrError migration_detect_error(MigrationState *s)
}
if (local_error) {
- migrate_set_error(s, local_error);
- error_free(local_error);
+ migrate_error_propagate(s, local_error);
}
if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret) {
@@ -3522,7 +3520,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
if (must_precopy <= s->threshold_size &&
can_switchover && qatomic_read(&s->start_postcopy)) {
if (postcopy_start(s, &local_err)) {
- migrate_set_error(s, local_err);
+ migrate_error_propagate(s, error_copy(local_err));
error_report_err(local_err);
}
return MIG_ITERATE_SKIP;
@@ -3819,8 +3817,7 @@ static void *migration_thread(void *opaque)
* devices to unplug. This to preserve migration state transitions.
*/
if (ret) {
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(s, local_err);
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
goto out;
@@ -3944,8 +3941,7 @@ static void *bg_migration_thread(void *opaque)
* devices to unplug. This to preserve migration state transitions.
*/
if (ret) {
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(s, local_err);
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
goto fail_setup;
@@ -4127,7 +4123,7 @@ void migration_connect(MigrationState *s, Error *error_in)
return;
fail:
- migrate_set_error(s, local_err);
+ migrate_error_propagate(s, error_copy(local_err));
if (s->state != MIGRATION_STATUS_CANCELLING) {
migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
}
diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
index db3239fef5..91d5d81556 100644
--- a/migration/multifd-device-state.c
+++ b/migration/multifd-device-state.c
@@ -143,8 +143,6 @@ static int multifd_device_state_save_thread(void *opaque)
Error *local_err = NULL;
if (!data->hdlr(data, &local_err)) {
- MigrationState *s = migrate_get_current();
-
/*
* Can't call abort_device_state_save_threads() here since new
* save threads could still be in process of being launched
@@ -158,8 +156,7 @@ static int multifd_device_state_save_thread(void *opaque)
* In case of multiple save threads failing which thread error
* return we end setting is purely arbitrary.
*/
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(migrate_get_current(), local_err);
}
return 0;
diff --git a/migration/multifd.c b/migration/multifd.c
index 52e4d25857..bf6da85af8 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -428,8 +428,9 @@ static void multifd_send_error_propagate(Error *err)
if (err) {
MigrationState *s = migrate_get_current();
- migrate_set_error(s, err);
- error_free(err);
+
+ migrate_error_propagate(s, err);
+
if (s->state == MIGRATION_STATUS_SETUP ||
s->state == MIGRATION_STATUS_PRE_SWITCHOVER ||
s->state == MIGRATION_STATUS_DEVICE ||
@@ -588,8 +589,7 @@ void multifd_send_shutdown(void)
Error *local_err = NULL;
if (!multifd_send_cleanup_channel(p, &local_err)) {
- migrate_set_error(migrate_get_current(), local_err);
- error_free(local_err);
+ migrate_error_propagate(migrate_get_current(), local_err);
}
}
@@ -962,8 +962,7 @@ bool multifd_send_setup(void)
p->write_flags = 0;
if (!multifd_new_send_channel_create(p, &local_err)) {
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(s, local_err);
ret = -1;
}
}
@@ -987,8 +986,7 @@ bool multifd_send_setup(void)
ret = multifd_send_state->ops->send_setup(p, &local_err);
if (ret) {
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(s, local_err);
goto err;
}
assert(p->iov);
@@ -1067,8 +1065,9 @@ static void multifd_recv_terminate_threads(Error *err)
if (err) {
MigrationState *s = migrate_get_current();
- migrate_set_error(s, err);
- error_free(err);
+
+ migrate_error_propagate(s, err);
+
if (s->state == MIGRATION_STATUS_SETUP ||
s->state == MIGRATION_STATUS_ACTIVE) {
migrate_set_state(&s->state, s->state,
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 7c9fe61041..dd8e31f5ae 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1927,8 +1927,7 @@ postcopy_preempt_send_channel_done(MigrationState *s,
QIOChannel *ioc, Error *local_err)
{
if (local_err) {
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(s, local_err);
} else {
migration_ioc_register_yank(ioc);
s->postcopy_qemufile_src = qemu_file_new_output(ioc);
@@ -2162,7 +2161,7 @@ static void *postcopy_listen_thread(void *opaque)
* exit depending on if postcopy-exit-on-error is true, but the
* migration cannot be recovered.
*/
- migrate_set_error(migr, local_err);
+ migrate_error_propagate(migr, error_copy(local_err));
error_report_err(local_err);
migrate_set_state(&mis->state, mis->state, MIGRATION_STATUS_FAILED);
goto out;
diff --git a/migration/ram.c b/migration/ram.c
index 29f016cb25..a2f456d858 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4730,9 +4730,7 @@ static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
* Abort and indicate a proper reason.
*/
error_setg(&err, "RAM block '%s' resized during precopy.", rb->idstr);
- migrate_set_error(migrate_get_current(), err);
- error_free(err);
-
+ migrate_error_propagate(migrate_get_current(), err);
migration_cancel();
}
diff --git a/migration/savevm.c b/migration/savevm.c
index 638e9b364f..67f10a5dbe 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1125,13 +1125,12 @@ void qemu_savevm_send_open_return_path(QEMUFile *f)
int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
{
uint32_t tmp;
- MigrationState *ms = migrate_get_current();
Error *local_err = NULL;
if (len > MAX_VM_CMD_PACKAGED_SIZE) {
error_setg(&local_err, "%s: Unreasonably large packaged state: %zu",
__func__, len);
- migrate_set_error(ms, local_err);
+ migrate_error_propagate(migrate_get_current(), error_copy(local_err));
error_report_err(local_err);
return -1;
}
@@ -1373,7 +1372,7 @@ int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
if (se->vmsd && se->vmsd->early_setup) {
ret = vmstate_save(f, se, vmdesc, errp);
if (ret) {
- migrate_set_error(ms, *errp);
+ migrate_error_propagate(ms, error_copy(*errp));
qemu_file_set_error(f, ret);
break;
}
@@ -1681,7 +1680,7 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
ret = vmstate_save(f, se, vmdesc, &local_err);
if (ret) {
- migrate_set_error(ms, local_err);
+ migrate_error_propagate(ms, error_copy(local_err));
error_report_err(local_err);
qemu_file_set_error(f, ret);
return ret;
@@ -1858,7 +1857,6 @@ void qemu_savevm_live_state(QEMUFile *f)
int qemu_save_device_state(QEMUFile *f)
{
- MigrationState *ms = migrate_get_current();
Error *local_err = NULL;
SaveStateEntry *se;
@@ -1876,7 +1874,8 @@ int qemu_save_device_state(QEMUFile *f)
}
ret = vmstate_save(f, se, NULL, &local_err);
if (ret) {
- migrate_set_error(ms, local_err);
+ migrate_error_propagate(migrate_get_current(),
+ error_copy(local_err));
error_report_err(local_err);
return ret;
}
@@ -2826,8 +2825,6 @@ static int qemu_loadvm_load_thread(void *thread_opaque)
Error *local_err = NULL;
if (!data->function(data->opaque, &mis->load_threads_abort, &local_err)) {
- MigrationState *s = migrate_get_current();
-
/*
* Can't set load_threads_abort here since processing of main migration
* channel data could still be happening, resulting in launching of new
@@ -2840,8 +2837,7 @@ static int qemu_loadvm_load_thread(void *thread_opaque)
* In case of multiple load threads failing which thread error
* return we end setting is purely arbitrary.
*/
- migrate_set_error(s, local_err);
- error_free(local_err);
+ migrate_error_propagate(migrate_get_current(), local_err);
}
return 0;
--
2.50.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 2/7] Revert "error: define g_autoptr() cleanup function for the Error type"
2025-12-01 19:45 ` [PATCH for-11.0 v2 2/7] Revert "error: define g_autoptr() cleanup function for the Error type" Peter Xu
@ 2025-12-02 7:53 ` Cédric Le Goater
0 siblings, 0 replies; 18+ messages in thread
From: Cédric Le Goater @ 2025-12-02 7:53 UTC (permalink / raw)
To: Peter Xu, qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, Juraj Marcin, Maciej S. Szmigiero
On 12/1/25 20:45, Peter Xu wrote:
> This reverts commit 18eb55546a54e443d94a4c49286348176ad4b00a.
>
> Due to the nature of how Error should be used (normally ownership will be
> passed over to Error APIs, like error_report_err), auto-free functions may
> be error prone on its own. The auto cleanup function was merged without
> proper review as pointed out by Dan and Markus:
>
> https://lore.kernel.org/r/aSWSLMi6ZhTCS_p2@redhat.com
>
> Cc: Cédric Le Goater <clg@redhat.com>
> Acked-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
> Reviewed-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> include/qapi/error.h | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/include/qapi/error.h b/include/qapi/error.h
> index b16c6303f8..f3ce4a4a2d 100644
> --- a/include/qapi/error.h
> +++ b/include/qapi/error.h
> @@ -437,8 +437,6 @@ Error *error_copy(const Error *err);
> */
> void error_free(Error *err);
>
> -G_DEFINE_AUTOPTR_CLEANUP_FUNC(Error, error_free)
> -
> /*
> * Convenience function to assert that *@errp is set, then silently free it.
> */
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 4/7] migration: Make migration_connect_set_error() own the error
2025-12-01 19:45 ` [PATCH for-11.0 v2 4/7] migration: Make migration_connect_set_error() own the error Peter Xu
@ 2025-12-02 11:42 ` Markus Armbruster
0 siblings, 0 replies; 18+ messages in thread
From: Markus Armbruster @ 2025-12-02 11:42 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Marc-André Lureau,
Juraj Marcin
Peter Xu <peterx@redhat.com> writes:
> Make migration_connect_set_error() take ownership of the error always.
> Paving way for making migrate_set_error() to take ownership.
>
> When at it, renaming it to migration_connect_error_propagate(), following
> Error API, to imply the Error object ownership transition.
>
> NOTE: this patch also makes migration_connect() to take ownership of the
> Error passed in.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> migration/channel.c | 1 -
> migration/migration.c | 9 +++++----
> 2 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/migration/channel.c b/migration/channel.c
> index 462cc183e1..92435fa7f7 100644
> --- a/migration/channel.c
> +++ b/migration/channel.c
> @@ -95,7 +95,6 @@ void migration_channel_connect(MigrationState *s,
> }
> }
> migration_connect(s, error);
> - error_free(error);
> }
>
>
> diff --git a/migration/migration.c b/migration/migration.c
> index b316ee01ab..0ff8b31a88 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1575,7 +1575,7 @@ static void migrate_error_free(MigrationState *s)
> }
> }
>
> -static void migration_connect_set_error(MigrationState *s, const Error *error)
> +static void migration_connect_error_propagate(MigrationState *s, Error *error)
> {
> MigrationStatus current = s->state;
> MigrationStatus next;
> @@ -1602,6 +1602,7 @@ static void migration_connect_set_error(MigrationState *s, const Error *error)
>
> migrate_set_state(&s->state, current, next);
> migrate_set_error(s, error);
> + error_free(error);
> }
>
> void migration_cancel(void)
> @@ -2292,7 +2293,7 @@ void qmp_migrate(const char *uri, bool has_channels,
>
> out:
> if (local_err) {
> - migration_connect_set_error(s, local_err);
> + migration_connect_error_propagate(s, error_copy(local_err));
> error_propagate(errp, local_err);
> }
> }
> @@ -2337,7 +2338,7 @@ static void qmp_migrate_finish(MigrationAddress *addr, bool resume_requested,
> if (!resume_requested) {
> yank_unregister_instance(MIGRATION_YANK_INSTANCE);
> }
> - migration_connect_set_error(s, local_err);
> + migration_connect_error_propagate(s, error_copy(local_err));
> error_propagate(errp, local_err);
> return;
> }
> @@ -4039,7 +4040,7 @@ void migration_connect(MigrationState *s, Error *error_in)
>
> s->expected_downtime = migrate_downtime_limit();
> if (error_in) {
> - migration_connect_set_error(s, error_in);
> + migration_connect_error_propagate(s, error_in);
> if (resume) {
> /*
> * Don't do cleanup for resume if channel is invalid, but only dump
Could be split for slightly easier review: first change
migration_connect(), then migrate_set_error(). I doubt it's worth the
bother now.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 6/7] migration: Make multifd_recv_terminate_threads() own the error
2025-12-01 19:45 ` [PATCH for-11.0 v2 6/7] migration: Make multifd_recv_terminate_threads() " Peter Xu
@ 2025-12-02 11:44 ` Markus Armbruster
0 siblings, 0 replies; 18+ messages in thread
From: Markus Armbruster @ 2025-12-02 11:44 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Marc-André Lureau,
Juraj Marcin
Peter Xu <peterx@redhat.com> writes:
> Make multifd_recv_terminate_threads() take ownership of the error always.
> Paving way for making migrate_set_error() to take ownership.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 7/7] migration: Replace migrate_set_error() with migrate_error_propagate()
2025-12-01 19:45 ` [PATCH for-11.0 v2 7/7] migration: Replace migrate_set_error() with migrate_error_propagate() Peter Xu
@ 2025-12-02 11:59 ` Markus Armbruster
2025-12-02 17:28 ` Peter Xu
0 siblings, 1 reply; 18+ messages in thread
From: Markus Armbruster @ 2025-12-02 11:59 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Marc-André Lureau,
Juraj Marcin
Peter Xu <peterx@redhat.com> writes:
> migrate_set_error() currently doesn't take ownership of the error being
> passed in. It's not aligned with the error API and meanwhile it also
> makes most of the caller free the error explicitly.
>
> Change the API to take the ownership of the Error object instead. This
> should save a lot of error_copy() invocations.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> migration/migration.h | 2 +-
> migration/cpr-exec.c | 5 ++--
> migration/migration.c | 44 +++++++++++++++-----------------
> migration/multifd-device-state.c | 5 +---
> migration/multifd.c | 19 +++++++-------
> migration/postcopy-ram.c | 5 ++--
> migration/ram.c | 4 +--
> migration/savevm.c | 16 +++++-------
> 8 files changed, 43 insertions(+), 57 deletions(-)
>
> diff --git a/migration/migration.h b/migration/migration.h
> index 213b33fe6e..e4b4f25deb 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -525,7 +525,7 @@ void migration_incoming_process(void);
>
> bool migration_has_all_channels(void);
>
> -void migrate_set_error(MigrationState *s, const Error *error);
> +void migrate_error_propagate(MigrationState *s, Error *error);
> bool migrate_has_error(MigrationState *s);
>
> void migration_connect(MigrationState *s, Error *error_in);
> diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c
> index 0b8344a86f..da287d8031 100644
> --- a/migration/cpr-exec.c
> +++ b/migration/cpr-exec.c
> @@ -158,8 +158,9 @@ static void cpr_exec_cb(void *opaque)
>
> error_report_err(error_copy(err));
> migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> - migrate_set_error(s, err);
> - error_free(err);
> +
> + migrate_error_propagate(s, err);
> + /* We must reset the error because it'll be reused later */
> err = NULL;
>
> /* Note, we can go from state COMPLETED to FAILED */
> diff --git a/migration/migration.c b/migration/migration.c
> index 0ff8b31a88..70813e5006 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -914,9 +914,7 @@ process_incoming_migration_co(void *opaque)
> fail:
> migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> MIGRATION_STATUS_FAILED);
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> -
> + migrate_error_propagate(s, local_err);
> migration_incoming_state_destroy();
>
> if (mis->exit_on_error) {
> @@ -1548,14 +1546,20 @@ static void migration_cleanup_bh(void *opaque)
> migration_cleanup(opaque);
> }
>
> -void migrate_set_error(MigrationState *s, const Error *error)
> +/*
> + * Propagate the Error* object to migration core. The caller mustn't
> + * reference the error pointer after the function returned, because the
> + * Error* object might be freed.
> + */
> +void migrate_error_propagate(MigrationState *s, Error *error)
> {
> QEMU_LOCK_GUARD(&s->error_mutex);
> -
> trace_migrate_error(error_get_pretty(error));
>
> if (!s->error) {
> - s->error = error_copy(error);
> + s->error = error;
> + } else {
> + error_free(error);
> }
This is correct. Also correct, and possibly easier to understand at a
glance:
error_propagate(&s->error, error);
error_propagate() has code for a number of additional conditions, but
these are all impossible:
* @error cannot be null here (because error_get_pretty(error) above)
* &s->error cannot be &error_abort, &error_fatal, or null
Thoughts?
> }
>
> @@ -1601,8 +1605,7 @@ static void migration_connect_error_propagate(MigrationState *s, Error *error)
> }
>
> migrate_set_state(&s->state, current, next);
> - migrate_set_error(s, error);
> - error_free(error);
> + migrate_error_propagate(s, error);
> }
>
> void migration_cancel(void)
> @@ -2014,8 +2017,7 @@ void qmp_migrate_pause(Error **errp)
>
> /* Tell the core migration that we're pausing */
> error_setg(&error, "Postcopy migration is paused by the user");
> - migrate_set_error(ms, error);
> - error_free(error);
> + migrate_error_propagate(ms, error);
>
> qemu_mutex_lock(&ms->qemu_file_lock);
> if (ms->to_dst_file) {
> @@ -2647,8 +2649,7 @@ static void *source_return_path_thread(void *opaque)
>
> out:
> if (err) {
> - migrate_set_error(ms, err);
> - error_free(err);
> + migrate_error_propagate(ms, err);
> trace_source_return_path_thread_bad_end();
> }
>
> @@ -3094,12 +3095,10 @@ static void migration_completion(MigrationState *s)
>
> fail:
> if (qemu_file_get_error_obj(s->to_dst_file, &local_err)) {
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(s, local_err);
> } else if (ret) {
> error_setg_errno(&local_err, -ret, "Error in migration completion");
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(s, local_err);
> }
>
> if (s->state != MIGRATION_STATUS_CANCELLING) {
> @@ -3326,8 +3325,7 @@ static MigThrError migration_detect_error(MigrationState *s)
> }
>
> if (local_error) {
> - migrate_set_error(s, local_error);
> - error_free(local_error);
> + migrate_error_propagate(s, local_error);
> }
>
> if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret) {
> @@ -3522,7 +3520,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
> if (must_precopy <= s->threshold_size &&
> can_switchover && qatomic_read(&s->start_postcopy)) {
> if (postcopy_start(s, &local_err)) {
> - migrate_set_error(s, local_err);
> + migrate_error_propagate(s, error_copy(local_err));
> error_report_err(local_err);
> }
> return MIG_ITERATE_SKIP;
> @@ -3819,8 +3817,7 @@ static void *migration_thread(void *opaque)
> * devices to unplug. This to preserve migration state transitions.
> */
> if (ret) {
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(s, local_err);
> migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> MIGRATION_STATUS_FAILED);
> goto out;
> @@ -3944,8 +3941,7 @@ static void *bg_migration_thread(void *opaque)
> * devices to unplug. This to preserve migration state transitions.
> */
> if (ret) {
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(s, local_err);
> migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> MIGRATION_STATUS_FAILED);
> goto fail_setup;
> @@ -4127,7 +4123,7 @@ void migration_connect(MigrationState *s, Error *error_in)
> return;
>
> fail:
> - migrate_set_error(s, local_err);
> + migrate_error_propagate(s, error_copy(local_err));
> if (s->state != MIGRATION_STATUS_CANCELLING) {
> migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> }
> diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
> index db3239fef5..91d5d81556 100644
> --- a/migration/multifd-device-state.c
> +++ b/migration/multifd-device-state.c
> @@ -143,8 +143,6 @@ static int multifd_device_state_save_thread(void *opaque)
> Error *local_err = NULL;
>
> if (!data->hdlr(data, &local_err)) {
> - MigrationState *s = migrate_get_current();
> -
> /*
> * Can't call abort_device_state_save_threads() here since new
> * save threads could still be in process of being launched
> @@ -158,8 +156,7 @@ static int multifd_device_state_save_thread(void *opaque)
> * In case of multiple save threads failing which thread error
> * return we end setting is purely arbitrary.
> */
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(migrate_get_current(), local_err);
> }
>
> return 0;
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 52e4d25857..bf6da85af8 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -428,8 +428,9 @@ static void multifd_send_error_propagate(Error *err)
>
> if (err) {
> MigrationState *s = migrate_get_current();
> - migrate_set_error(s, err);
> - error_free(err);
> +
> + migrate_error_propagate(s, err);
> +
> if (s->state == MIGRATION_STATUS_SETUP ||
> s->state == MIGRATION_STATUS_PRE_SWITCHOVER ||
> s->state == MIGRATION_STATUS_DEVICE ||
> @@ -588,8 +589,7 @@ void multifd_send_shutdown(void)
> Error *local_err = NULL;
>
> if (!multifd_send_cleanup_channel(p, &local_err)) {
> - migrate_set_error(migrate_get_current(), local_err);
> - error_free(local_err);
> + migrate_error_propagate(migrate_get_current(), local_err);
> }
> }
>
> @@ -962,8 +962,7 @@ bool multifd_send_setup(void)
> p->write_flags = 0;
>
> if (!multifd_new_send_channel_create(p, &local_err)) {
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(s, local_err);
> ret = -1;
> }
> }
> @@ -987,8 +986,7 @@ bool multifd_send_setup(void)
>
> ret = multifd_send_state->ops->send_setup(p, &local_err);
> if (ret) {
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(s, local_err);
> goto err;
> }
> assert(p->iov);
> @@ -1067,8 +1065,9 @@ static void multifd_recv_terminate_threads(Error *err)
>
> if (err) {
> MigrationState *s = migrate_get_current();
> - migrate_set_error(s, err);
> - error_free(err);
> +
> + migrate_error_propagate(s, err);
> +
> if (s->state == MIGRATION_STATUS_SETUP ||
> s->state == MIGRATION_STATUS_ACTIVE) {
> migrate_set_state(&s->state, s->state,
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 7c9fe61041..dd8e31f5ae 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1927,8 +1927,7 @@ postcopy_preempt_send_channel_done(MigrationState *s,
> QIOChannel *ioc, Error *local_err)
> {
> if (local_err) {
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(s, local_err);
> } else {
> migration_ioc_register_yank(ioc);
> s->postcopy_qemufile_src = qemu_file_new_output(ioc);
> @@ -2162,7 +2161,7 @@ static void *postcopy_listen_thread(void *opaque)
> * exit depending on if postcopy-exit-on-error is true, but the
> * migration cannot be recovered.
> */
> - migrate_set_error(migr, local_err);
> + migrate_error_propagate(migr, error_copy(local_err));
> error_report_err(local_err);
> migrate_set_state(&mis->state, mis->state, MIGRATION_STATUS_FAILED);
> goto out;
> diff --git a/migration/ram.c b/migration/ram.c
> index 29f016cb25..a2f456d858 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -4730,9 +4730,7 @@ static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
> * Abort and indicate a proper reason.
> */
> error_setg(&err, "RAM block '%s' resized during precopy.", rb->idstr);
> - migrate_set_error(migrate_get_current(), err);
> - error_free(err);
> -
> + migrate_error_propagate(migrate_get_current(), err);
> migration_cancel();
> }
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 638e9b364f..67f10a5dbe 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1125,13 +1125,12 @@ void qemu_savevm_send_open_return_path(QEMUFile *f)
> int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
> {
> uint32_t tmp;
> - MigrationState *ms = migrate_get_current();
> Error *local_err = NULL;
>
> if (len > MAX_VM_CMD_PACKAGED_SIZE) {
> error_setg(&local_err, "%s: Unreasonably large packaged state: %zu",
> __func__, len);
> - migrate_set_error(ms, local_err);
> + migrate_error_propagate(migrate_get_current(), error_copy(local_err));
I'd break this line between the arguments. Matter of taste, up to you.
> error_report_err(local_err);
> return -1;
> }
> @@ -1373,7 +1372,7 @@ int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
> if (se->vmsd && se->vmsd->early_setup) {
> ret = vmstate_save(f, se, vmdesc, errp);
> if (ret) {
> - migrate_set_error(ms, *errp);
> + migrate_error_propagate(ms, error_copy(*errp));
> qemu_file_set_error(f, ret);
> break;
> }
> @@ -1681,7 +1680,7 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
>
> ret = vmstate_save(f, se, vmdesc, &local_err);
> if (ret) {
> - migrate_set_error(ms, local_err);
> + migrate_error_propagate(ms, error_copy(local_err));
> error_report_err(local_err);
> qemu_file_set_error(f, ret);
> return ret;
> @@ -1858,7 +1857,6 @@ void qemu_savevm_live_state(QEMUFile *f)
>
> int qemu_save_device_state(QEMUFile *f)
> {
> - MigrationState *ms = migrate_get_current();
> Error *local_err = NULL;
> SaveStateEntry *se;
>
> @@ -1876,7 +1874,8 @@ int qemu_save_device_state(QEMUFile *f)
> }
> ret = vmstate_save(f, se, NULL, &local_err);
> if (ret) {
> - migrate_set_error(ms, local_err);
> + migrate_error_propagate(migrate_get_current(),
> + error_copy(local_err));
> error_report_err(local_err);
> return ret;
> }
> @@ -2826,8 +2825,6 @@ static int qemu_loadvm_load_thread(void *thread_opaque)
> Error *local_err = NULL;
>
> if (!data->function(data->opaque, &mis->load_threads_abort, &local_err)) {
> - MigrationState *s = migrate_get_current();
> -
> /*
> * Can't set load_threads_abort here since processing of main migration
> * channel data could still be happening, resulting in launching of new
> @@ -2840,8 +2837,7 @@ static int qemu_loadvm_load_thread(void *thread_opaque)
> * In case of multiple load threads failing which thread error
> * return we end setting is purely arbitrary.
> */
> - migrate_set_error(s, local_err);
> - error_free(local_err);
> + migrate_error_propagate(migrate_get_current(), local_err);
> }
>
> return 0;
Reviewed-by: Markus Armbruster <armbru@redhat.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
` (6 preceding siblings ...)
2025-12-01 19:45 ` [PATCH for-11.0 v2 7/7] migration: Replace migrate_set_error() with migrate_error_propagate() Peter Xu
@ 2025-12-02 13:55 ` Fabiano Rosas
2025-12-02 17:47 ` Peter Xu
2025-12-02 17:53 ` [PATCH for-11.0 v2 8/7] migration: Use error_propagate() in migrate_error_propagate() Peter Xu
2025-12-03 14:33 ` [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
9 siblings, 1 reply; 18+ messages in thread
From: Fabiano Rosas @ 2025-12-02 13:55 UTC (permalink / raw)
To: Peter Xu, qemu-devel
Cc: Daniel P . Berrangé, Vladimir Sementsov-Ogievskiy,
Markus Armbruster, Marc-André Lureau, peterx, Juraj Marcin
Peter Xu <peterx@redhat.com> writes:
> Based-on: <20251125070554.2256181-1-armbru@redhat.com>
>
> This series is based on Markus's recent fix:
>
> [PATCH] migration: Fix double-free on error path
> https://lore.kernel.org/r/20251125070554.2256181-1-armbru@redhat.com
>
> v2:
> - Added R-bs
> - Patch 1:
> - update commit message on s/accidentally merged/merged without proper
> review/ [Markus]
> - Patch 2:
> - Added a new follow up patch here from Markus to poison Error's autoptr
> - Patch 3:
> - Rename migration_connect_set_error to migration_connect_error_propagate
> [Markus]
> - Add comments in commit log for both migrate_connect() and the rename
> [Markus]
> - Patch 4:
> - Rename multifd_send_set_error to multifd_send_error_propagate [Markus]
> - Patch 6:
> - Make migrate_error_propagate() take MigrationState* as before [Markus]
> - Remove the one use case of g_clear_pointer() [Markus]
> - Touch up commit message for the change
>
> This series should address the issues discussed in this thread here:
>
> https://lore.kernel.org/r/871plmk1bc.fsf@pond.sub.org
Thank you Markus for this. It's very helpful to have someone keeping us
in check regarding the usage of generic QEMU interfaces. Migration code
tends to drift incredibly..
>
> The problem is Error is not a good candidate of g_autoptr, however the
> cleanup function was merged without enough review. Luckily, we only have
> two users so far (after Markus's patch above lands). This series removes
> the last two in migration code and reverts the auto cleanup function for
> Error. Instead, poison the auto cleanup function.
>
> When at it, it'll also change migrate_set_error() to start taking ownership
> of errors, just like what most error APIs do. When at it, it is renamed to
> migrate_error_propagate() to imply migration version of error_propagate().
>
> Comments welcomed, thanks.
>
I think with this series we could now work to reduce the complexity of
migration_connect():
The outgoing code in socket.c and tls.c could call
migration_connect_error_propagate directly so migration_channel_connect
only needs to check migrate_has_error() and then exit as early as
possible. From migration_connect onwards we can assume connection
success.
What do you think?
tangent:
(is it too much bikeshedding if I send a patch doing s/migrat*_/mig_/
all over the place? it's so annoying having to check the code to get the
prefix correct when writing emails)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 7/7] migration: Replace migrate_set_error() with migrate_error_propagate()
2025-12-02 11:59 ` Markus Armbruster
@ 2025-12-02 17:28 ` Peter Xu
0 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2025-12-02 17:28 UTC (permalink / raw)
To: Markus Armbruster
Cc: qemu-devel, Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Marc-André Lureau,
Juraj Marcin
On Tue, Dec 02, 2025 at 12:59:40PM +0100, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > migrate_set_error() currently doesn't take ownership of the error being
> > passed in. It's not aligned with the error API and meanwhile it also
> > makes most of the caller free the error explicitly.
> >
> > Change the API to take the ownership of the Error object instead. This
> > should save a lot of error_copy() invocations.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > migration/migration.h | 2 +-
> > migration/cpr-exec.c | 5 ++--
> > migration/migration.c | 44 +++++++++++++++-----------------
> > migration/multifd-device-state.c | 5 +---
> > migration/multifd.c | 19 +++++++-------
> > migration/postcopy-ram.c | 5 ++--
> > migration/ram.c | 4 +--
> > migration/savevm.c | 16 +++++-------
> > 8 files changed, 43 insertions(+), 57 deletions(-)
> >
> > diff --git a/migration/migration.h b/migration/migration.h
> > index 213b33fe6e..e4b4f25deb 100644
> > --- a/migration/migration.h
> > +++ b/migration/migration.h
> > @@ -525,7 +525,7 @@ void migration_incoming_process(void);
> >
> > bool migration_has_all_channels(void);
> >
> > -void migrate_set_error(MigrationState *s, const Error *error);
> > +void migrate_error_propagate(MigrationState *s, Error *error);
> > bool migrate_has_error(MigrationState *s);
> >
> > void migration_connect(MigrationState *s, Error *error_in);
> > diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c
> > index 0b8344a86f..da287d8031 100644
> > --- a/migration/cpr-exec.c
> > +++ b/migration/cpr-exec.c
> > @@ -158,8 +158,9 @@ static void cpr_exec_cb(void *opaque)
> >
> > error_report_err(error_copy(err));
> > migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> > - migrate_set_error(s, err);
> > - error_free(err);
> > +
> > + migrate_error_propagate(s, err);
> > + /* We must reset the error because it'll be reused later */
> > err = NULL;
> >
> > /* Note, we can go from state COMPLETED to FAILED */
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 0ff8b31a88..70813e5006 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -914,9 +914,7 @@ process_incoming_migration_co(void *opaque)
> > fail:
> > migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> > MIGRATION_STATUS_FAILED);
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > -
> > + migrate_error_propagate(s, local_err);
> > migration_incoming_state_destroy();
> >
> > if (mis->exit_on_error) {
> > @@ -1548,14 +1546,20 @@ static void migration_cleanup_bh(void *opaque)
> > migration_cleanup(opaque);
> > }
> >
> > -void migrate_set_error(MigrationState *s, const Error *error)
> > +/*
> > + * Propagate the Error* object to migration core. The caller mustn't
> > + * reference the error pointer after the function returned, because the
> > + * Error* object might be freed.
> > + */
> > +void migrate_error_propagate(MigrationState *s, Error *error)
> > {
> > QEMU_LOCK_GUARD(&s->error_mutex);
> > -
> > trace_migrate_error(error_get_pretty(error));
> >
> > if (!s->error) {
> > - s->error = error_copy(error);
> > + s->error = error;
> > + } else {
> > + error_free(error);
> > }
>
> This is correct. Also correct, and possibly easier to understand at a
> glance:
>
> error_propagate(&s->error, error);
>
> error_propagate() has code for a number of additional conditions, but
> these are all impossible:
>
> * @error cannot be null here (because error_get_pretty(error) above)
>
> * &s->error cannot be &error_abort, &error_fatal, or null
>
> Thoughts?
Yes it looks more readable indeed, thanks for the suggestion.
Since the whole series got almost reviewed, instead of reposting everything
I'll send a small follow up patch soon as a separate cleanup.
>
> > }
> >
> > @@ -1601,8 +1605,7 @@ static void migration_connect_error_propagate(MigrationState *s, Error *error)
> > }
> >
> > migrate_set_state(&s->state, current, next);
> > - migrate_set_error(s, error);
> > - error_free(error);
> > + migrate_error_propagate(s, error);
> > }
> >
> > void migration_cancel(void)
> > @@ -2014,8 +2017,7 @@ void qmp_migrate_pause(Error **errp)
> >
> > /* Tell the core migration that we're pausing */
> > error_setg(&error, "Postcopy migration is paused by the user");
> > - migrate_set_error(ms, error);
> > - error_free(error);
> > + migrate_error_propagate(ms, error);
> >
> > qemu_mutex_lock(&ms->qemu_file_lock);
> > if (ms->to_dst_file) {
> > @@ -2647,8 +2649,7 @@ static void *source_return_path_thread(void *opaque)
> >
> > out:
> > if (err) {
> > - migrate_set_error(ms, err);
> > - error_free(err);
> > + migrate_error_propagate(ms, err);
> > trace_source_return_path_thread_bad_end();
> > }
> >
> > @@ -3094,12 +3095,10 @@ static void migration_completion(MigrationState *s)
> >
> > fail:
> > if (qemu_file_get_error_obj(s->to_dst_file, &local_err)) {
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(s, local_err);
> > } else if (ret) {
> > error_setg_errno(&local_err, -ret, "Error in migration completion");
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(s, local_err);
> > }
> >
> > if (s->state != MIGRATION_STATUS_CANCELLING) {
> > @@ -3326,8 +3325,7 @@ static MigThrError migration_detect_error(MigrationState *s)
> > }
> >
> > if (local_error) {
> > - migrate_set_error(s, local_error);
> > - error_free(local_error);
> > + migrate_error_propagate(s, local_error);
> > }
> >
> > if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret) {
> > @@ -3522,7 +3520,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
> > if (must_precopy <= s->threshold_size &&
> > can_switchover && qatomic_read(&s->start_postcopy)) {
> > if (postcopy_start(s, &local_err)) {
> > - migrate_set_error(s, local_err);
> > + migrate_error_propagate(s, error_copy(local_err));
> > error_report_err(local_err);
> > }
> > return MIG_ITERATE_SKIP;
> > @@ -3819,8 +3817,7 @@ static void *migration_thread(void *opaque)
> > * devices to unplug. This to preserve migration state transitions.
> > */
> > if (ret) {
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(s, local_err);
> > migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> > MIGRATION_STATUS_FAILED);
> > goto out;
> > @@ -3944,8 +3941,7 @@ static void *bg_migration_thread(void *opaque)
> > * devices to unplug. This to preserve migration state transitions.
> > */
> > if (ret) {
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(s, local_err);
> > migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> > MIGRATION_STATUS_FAILED);
> > goto fail_setup;
> > @@ -4127,7 +4123,7 @@ void migration_connect(MigrationState *s, Error *error_in)
> > return;
> >
> > fail:
> > - migrate_set_error(s, local_err);
> > + migrate_error_propagate(s, error_copy(local_err));
> > if (s->state != MIGRATION_STATUS_CANCELLING) {
> > migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> > }
> > diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
> > index db3239fef5..91d5d81556 100644
> > --- a/migration/multifd-device-state.c
> > +++ b/migration/multifd-device-state.c
> > @@ -143,8 +143,6 @@ static int multifd_device_state_save_thread(void *opaque)
> > Error *local_err = NULL;
> >
> > if (!data->hdlr(data, &local_err)) {
> > - MigrationState *s = migrate_get_current();
> > -
> > /*
> > * Can't call abort_device_state_save_threads() here since new
> > * save threads could still be in process of being launched
> > @@ -158,8 +156,7 @@ static int multifd_device_state_save_thread(void *opaque)
> > * In case of multiple save threads failing which thread error
> > * return we end setting is purely arbitrary.
> > */
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(migrate_get_current(), local_err);
> > }
> >
> > return 0;
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 52e4d25857..bf6da85af8 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -428,8 +428,9 @@ static void multifd_send_error_propagate(Error *err)
> >
> > if (err) {
> > MigrationState *s = migrate_get_current();
> > - migrate_set_error(s, err);
> > - error_free(err);
> > +
> > + migrate_error_propagate(s, err);
> > +
> > if (s->state == MIGRATION_STATUS_SETUP ||
> > s->state == MIGRATION_STATUS_PRE_SWITCHOVER ||
> > s->state == MIGRATION_STATUS_DEVICE ||
> > @@ -588,8 +589,7 @@ void multifd_send_shutdown(void)
> > Error *local_err = NULL;
> >
> > if (!multifd_send_cleanup_channel(p, &local_err)) {
> > - migrate_set_error(migrate_get_current(), local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(migrate_get_current(), local_err);
> > }
> > }
> >
> > @@ -962,8 +962,7 @@ bool multifd_send_setup(void)
> > p->write_flags = 0;
> >
> > if (!multifd_new_send_channel_create(p, &local_err)) {
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(s, local_err);
> > ret = -1;
> > }
> > }
> > @@ -987,8 +986,7 @@ bool multifd_send_setup(void)
> >
> > ret = multifd_send_state->ops->send_setup(p, &local_err);
> > if (ret) {
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(s, local_err);
> > goto err;
> > }
> > assert(p->iov);
> > @@ -1067,8 +1065,9 @@ static void multifd_recv_terminate_threads(Error *err)
> >
> > if (err) {
> > MigrationState *s = migrate_get_current();
> > - migrate_set_error(s, err);
> > - error_free(err);
> > +
> > + migrate_error_propagate(s, err);
> > +
> > if (s->state == MIGRATION_STATUS_SETUP ||
> > s->state == MIGRATION_STATUS_ACTIVE) {
> > migrate_set_state(&s->state, s->state,
> > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > index 7c9fe61041..dd8e31f5ae 100644
> > --- a/migration/postcopy-ram.c
> > +++ b/migration/postcopy-ram.c
> > @@ -1927,8 +1927,7 @@ postcopy_preempt_send_channel_done(MigrationState *s,
> > QIOChannel *ioc, Error *local_err)
> > {
> > if (local_err) {
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(s, local_err);
> > } else {
> > migration_ioc_register_yank(ioc);
> > s->postcopy_qemufile_src = qemu_file_new_output(ioc);
> > @@ -2162,7 +2161,7 @@ static void *postcopy_listen_thread(void *opaque)
> > * exit depending on if postcopy-exit-on-error is true, but the
> > * migration cannot be recovered.
> > */
> > - migrate_set_error(migr, local_err);
> > + migrate_error_propagate(migr, error_copy(local_err));
> > error_report_err(local_err);
> > migrate_set_state(&mis->state, mis->state, MIGRATION_STATUS_FAILED);
> > goto out;
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 29f016cb25..a2f456d858 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -4730,9 +4730,7 @@ static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
> > * Abort and indicate a proper reason.
> > */
> > error_setg(&err, "RAM block '%s' resized during precopy.", rb->idstr);
> > - migrate_set_error(migrate_get_current(), err);
> > - error_free(err);
> > -
> > + migrate_error_propagate(migrate_get_current(), err);
> > migration_cancel();
> > }
> >
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 638e9b364f..67f10a5dbe 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -1125,13 +1125,12 @@ void qemu_savevm_send_open_return_path(QEMUFile *f)
> > int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
> > {
> > uint32_t tmp;
> > - MigrationState *ms = migrate_get_current();
> > Error *local_err = NULL;
> >
> > if (len > MAX_VM_CMD_PACKAGED_SIZE) {
> > error_setg(&local_err, "%s: Unreasonably large packaged state: %zu",
> > __func__, len);
> > - migrate_set_error(ms, local_err);
> > + migrate_error_propagate(migrate_get_current(), error_copy(local_err));
>
> I'd break this line between the arguments. Matter of taste, up to you.
Would you mind if I queue this series together with patch 2+3? If so, I
can touch it up when queuing.
>
> > error_report_err(local_err);
> > return -1;
> > }
> > @@ -1373,7 +1372,7 @@ int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
> > if (se->vmsd && se->vmsd->early_setup) {
> > ret = vmstate_save(f, se, vmdesc, errp);
> > if (ret) {
> > - migrate_set_error(ms, *errp);
> > + migrate_error_propagate(ms, error_copy(*errp));
> > qemu_file_set_error(f, ret);
> > break;
> > }
> > @@ -1681,7 +1680,7 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
> >
> > ret = vmstate_save(f, se, vmdesc, &local_err);
> > if (ret) {
> > - migrate_set_error(ms, local_err);
> > + migrate_error_propagate(ms, error_copy(local_err));
> > error_report_err(local_err);
> > qemu_file_set_error(f, ret);
> > return ret;
> > @@ -1858,7 +1857,6 @@ void qemu_savevm_live_state(QEMUFile *f)
> >
> > int qemu_save_device_state(QEMUFile *f)
> > {
> > - MigrationState *ms = migrate_get_current();
> > Error *local_err = NULL;
> > SaveStateEntry *se;
> >
> > @@ -1876,7 +1874,8 @@ int qemu_save_device_state(QEMUFile *f)
> > }
> > ret = vmstate_save(f, se, NULL, &local_err);
> > if (ret) {
> > - migrate_set_error(ms, local_err);
> > + migrate_error_propagate(migrate_get_current(),
> > + error_copy(local_err));
> > error_report_err(local_err);
> > return ret;
> > }
> > @@ -2826,8 +2825,6 @@ static int qemu_loadvm_load_thread(void *thread_opaque)
> > Error *local_err = NULL;
> >
> > if (!data->function(data->opaque, &mis->load_threads_abort, &local_err)) {
> > - MigrationState *s = migrate_get_current();
> > -
> > /*
> > * Can't set load_threads_abort here since processing of main migration
> > * channel data could still be happening, resulting in launching of new
> > @@ -2840,8 +2837,7 @@ static int qemu_loadvm_load_thread(void *thread_opaque)
> > * In case of multiple load threads failing which thread error
> > * return we end setting is purely arbitrary.
> > */
> > - migrate_set_error(s, local_err);
> > - error_free(local_err);
> > + migrate_error_propagate(migrate_get_current(), local_err);
> > }
> >
> > return 0;
>
> Reviewed-by: Markus Armbruster <armbru@redhat.com>
Thanks!
--
Peter Xu
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups
2025-12-02 13:55 ` [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Fabiano Rosas
@ 2025-12-02 17:47 ` Peter Xu
0 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2025-12-02 17:47 UTC (permalink / raw)
To: Fabiano Rosas
Cc: qemu-devel, Daniel P . Berrangé,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, Juraj Marcin
On Tue, Dec 02, 2025 at 10:55:04AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > Based-on: <20251125070554.2256181-1-armbru@redhat.com>
> >
> > This series is based on Markus's recent fix:
> >
> > [PATCH] migration: Fix double-free on error path
> > https://lore.kernel.org/r/20251125070554.2256181-1-armbru@redhat.com
> >
> > v2:
> > - Added R-bs
> > - Patch 1:
> > - update commit message on s/accidentally merged/merged without proper
> > review/ [Markus]
> > - Patch 2:
> > - Added a new follow up patch here from Markus to poison Error's autoptr
> > - Patch 3:
> > - Rename migration_connect_set_error to migration_connect_error_propagate
> > [Markus]
> > - Add comments in commit log for both migrate_connect() and the rename
> > [Markus]
> > - Patch 4:
> > - Rename multifd_send_set_error to multifd_send_error_propagate [Markus]
> > - Patch 6:
> > - Make migrate_error_propagate() take MigrationState* as before [Markus]
> > - Remove the one use case of g_clear_pointer() [Markus]
> > - Touch up commit message for the change
> >
> > This series should address the issues discussed in this thread here:
> >
> > https://lore.kernel.org/r/871plmk1bc.fsf@pond.sub.org
>
> Thank you Markus for this. It's very helpful to have someone keeping us
> in check regarding the usage of generic QEMU interfaces. Migration code
> tends to drift incredibly..
>
> >
> > The problem is Error is not a good candidate of g_autoptr, however the
> > cleanup function was merged without enough review. Luckily, we only have
> > two users so far (after Markus's patch above lands). This series removes
> > the last two in migration code and reverts the auto cleanup function for
> > Error. Instead, poison the auto cleanup function.
> >
> > When at it, it'll also change migrate_set_error() to start taking ownership
> > of errors, just like what most error APIs do. When at it, it is renamed to
> > migrate_error_propagate() to imply migration version of error_propagate().
> >
> > Comments welcomed, thanks.
> >
>
> I think with this series we could now work to reduce the complexity of
> migration_connect():
>
> The outgoing code in socket.c and tls.c could call
> migration_connect_error_propagate directly so migration_channel_connect
> only needs to check migrate_has_error() and then exit as early as
> possible. From migration_connect onwards we can assume connection
> success.
>
> What do you think?
As long as you read commit 688a3dcba980bf and will manage all those, it
sounds like a good thing to try.
>
> tangent:
> (is it too much bikeshedding if I send a patch doing s/migrat*_/mig_/
> all over the place? it's so annoying having to check the code to get the
> prefix correct when writing emails)
From downstream POV, it'll be a slight burden whenever we need to backport
later patches to "the world before the rename". It's not a huge deal but
we should consider that.
I'd confess it's likely the best time to do this if you want it for
upstream POV - we don't have a lot concurrent projects ongoing, so if this
lands it can be in the 1st pull for 11.0.
If you have some "vibe coding" tools, maybe you can spend 2 mins to see how
it looks like and decide whether to send a patch. I would say don't spend
too much time on this (while you're still keep rebasing the options series! :)
--
Peter Xu
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH for-11.0 v2 8/7] migration: Use error_propagate() in migrate_error_propagate()
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
` (7 preceding siblings ...)
2025-12-02 13:55 ` [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Fabiano Rosas
@ 2025-12-02 17:53 ` Peter Xu
2025-12-03 7:38 ` Markus Armbruster
2025-12-03 14:33 ` [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
9 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2025-12-02 17:53 UTC (permalink / raw)
To: qemu-devel
Cc: berrange, farosas, vsementsov, armbru, marcandre.lureau, peterx,
jmarcin
It improves readability, as suggested by Markus.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 70813e5006..d55fde222a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1555,12 +1555,7 @@ void migrate_error_propagate(MigrationState *s, Error *error)
{
QEMU_LOCK_GUARD(&s->error_mutex);
trace_migrate_error(error_get_pretty(error));
-
- if (!s->error) {
- s->error = error;
- } else {
- error_free(error);
- }
+ error_propagate(&s->error, error);
}
bool migrate_has_error(MigrationState *s)
--
2.50.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 8/7] migration: Use error_propagate() in migrate_error_propagate()
2025-12-02 17:53 ` [PATCH for-11.0 v2 8/7] migration: Use error_propagate() in migrate_error_propagate() Peter Xu
@ 2025-12-03 7:38 ` Markus Armbruster
0 siblings, 0 replies; 18+ messages in thread
From: Markus Armbruster @ 2025-12-03 7:38 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, berrange, farosas, vsementsov, armbru,
marcandre.lureau, jmarcin
Peter Xu <peterx@redhat.com> writes:
> It improves readability, as suggested by Markus.
>
> Suggested-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> migration/migration.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 70813e5006..d55fde222a 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1555,12 +1555,7 @@ void migrate_error_propagate(MigrationState *s, Error *error)
> {
> QEMU_LOCK_GUARD(&s->error_mutex);
> trace_migrate_error(error_get_pretty(error));
> -
> - if (!s->error) {
> - s->error = error;
> - } else {
> - error_free(error);
> - }
> + error_propagate(&s->error, error);
> }
>
> bool migrate_has_error(MigrationState *s)
Reviewed-by: Markus Armbruster <armbru@redhat.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
` (8 preceding siblings ...)
2025-12-02 17:53 ` [PATCH for-11.0 v2 8/7] migration: Use error_propagate() in migrate_error_propagate() Peter Xu
@ 2025-12-03 14:33 ` Peter Xu
9 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2025-12-03 14:33 UTC (permalink / raw)
To: qemu-devel
Cc: Daniel P . Berrangé, Fabiano Rosas,
Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Marc-André Lureau, Juraj Marcin
On Mon, Dec 01, 2025 at 02:45:03PM -0500, Peter Xu wrote:
> Based-on: <20251125070554.2256181-1-armbru@redhat.com>
>
> This series is based on Markus's recent fix:
>
> [PATCH] migration: Fix double-free on error path
> https://lore.kernel.org/r/20251125070554.2256181-1-armbru@redhat.com
>
> v2:
> - Added R-bs
> - Patch 1:
> - update commit message on s/accidentally merged/merged without proper
> review/ [Markus]
> - Patch 2:
> - Added a new follow up patch here from Markus to poison Error's autoptr
> - Patch 3:
> - Rename migration_connect_set_error to migration_connect_error_propagate
> [Markus]
> - Add comments in commit log for both migrate_connect() and the rename
> [Markus]
> - Patch 4:
> - Rename multifd_send_set_error to multifd_send_error_propagate [Markus]
> - Patch 6:
> - Make migrate_error_propagate() take MigrationState* as before [Markus]
> - Remove the one use case of g_clear_pointer() [Markus]
> - Touch up commit message for the change
>
> This series should address the issues discussed in this thread here:
>
> https://lore.kernel.org/r/871plmk1bc.fsf@pond.sub.org
>
> The problem is Error is not a good candidate of g_autoptr, however the
> cleanup function was merged without enough review. Luckily, we only have
> two users so far (after Markus's patch above lands). This series removes
> the last two in migration code and reverts the auto cleanup function for
> Error. Instead, poison the auto cleanup function.
>
> When at it, it'll also change migrate_set_error() to start taking ownership
> of errors, just like what most error APIs do. When at it, it is renamed to
> migrate_error_propagate() to imply migration version of error_propagate().
>
> Comments welcomed, thanks.
>
> Markus Armbruster (1):
> error: Poison g_autoptr(Error) to prevent its use
>
> Peter Xu (6):
> migration: Use explicit error_free() instead of g_autoptr
> Revert "error: define g_autoptr() cleanup function for the Error type"
> migration: Make migration_connect_set_error() own the error
> migration: Make multifd_send_set_error() own the error
> migration: Make multifd_recv_terminate_threads() own the error
> migration: Replace migrate_set_error() with migrate_error_propagate()
>
> include/qapi/error.h | 20 ++++++++++++-
> migration/migration.h | 2 +-
> migration/channel.c | 1 -
> migration/cpr-exec.c | 5 ++--
> migration/migration.c | 51 +++++++++++++++-----------------
> migration/multifd-device-state.c | 6 ++--
> migration/multifd.c | 30 +++++++++----------
> migration/postcopy-ram.c | 5 ++--
> migration/ram.c | 4 +--
> migration/savevm.c | 17 +++++------
> 10 files changed, 73 insertions(+), 68 deletions(-)
Thanks for the reviews, I queued all 8 patches for 11.0 (with small
tweaks per discussion).
--
Peter Xu
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-12-03 14:34 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-01 19:45 [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 1/7] migration: Use explicit error_free() instead of g_autoptr Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 2/7] Revert "error: define g_autoptr() cleanup function for the Error type" Peter Xu
2025-12-02 7:53 ` Cédric Le Goater
2025-12-01 19:45 ` [PATCH for-11.0 v2 3/7] error: Poison g_autoptr(Error) to prevent its use Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 4/7] migration: Make migration_connect_set_error() own the error Peter Xu
2025-12-02 11:42 ` Markus Armbruster
2025-12-01 19:45 ` [PATCH for-11.0 v2 5/7] migration: Make multifd_send_set_error() " Peter Xu
2025-12-01 19:45 ` [PATCH for-11.0 v2 6/7] migration: Make multifd_recv_terminate_threads() " Peter Xu
2025-12-02 11:44 ` Markus Armbruster
2025-12-01 19:45 ` [PATCH for-11.0 v2 7/7] migration: Replace migrate_set_error() with migrate_error_propagate() Peter Xu
2025-12-02 11:59 ` Markus Armbruster
2025-12-02 17:28 ` Peter Xu
2025-12-02 13:55 ` [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Fabiano Rosas
2025-12-02 17:47 ` Peter Xu
2025-12-02 17:53 ` [PATCH for-11.0 v2 8/7] migration: Use error_propagate() in migrate_error_propagate() Peter Xu
2025-12-03 7:38 ` Markus Armbruster
2025-12-03 14:33 ` [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups Peter Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).