* [PATCH v4 0/4] migration: do not exit on incoming failure
@ 2024-04-29 7:55 Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 1/4] migration: move trace-point from migrate_fd_error to migrate_set_error Vladimir Sementsov-Ogievskiy
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-04-29 7:55 UTC (permalink / raw)
To: peterx, farosas; +Cc: eblake, armbru, pbonzini, qemu-devel, vsementsov, yc-core
Hi all!
The series brings an option to not immediately exit on incoming
migration failure, giving a possibility to orchestrator to get the error
through QAPI and shutdown QEMU by "quit".
v4:
- add r-b and a-b by Fabiano and Markus
- improve wording in 04 as Markus suggested
v3:
- don't refactor the whole code around setting migration error, it seems
too much and necessary for the new feature itself
- add constant
- change behavior for HMP command
- split some things to separate patches
- and more, by Peter's suggestions
New behavior can be demonstrated like this:
bash:
(
cat <<EOF
{'execute': 'qmp_capabilities'}
{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [{'capability': 'events', 'state': true}]}}
{'execute': 'migrate-incoming', 'arguments': {'uri': 'exec:echo x', 'exit-on-error': false}}
EOF
sleep 1
cat <<EOF
{'execute': 'query-migrate'}
{'execute': 'quit'}
EOF
) | ./build/qemu-system-x86_64 -incoming 'defer' -qmp stdio -nographic -nodefaults
output:
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 9}, "package": "v9.0.0-149-gb6295ad58c"}, "capabilities": ["oob"]}}
{"return": {}}
{"return": {}}
{"timestamp": {"seconds": 1714068847, "microseconds": 263907}, "event": "MIGRATION", "data": {"status": "setup"}}
{"return": {}}
{"timestamp": {"seconds": 1714068847, "microseconds": 266696}, "event": "MIGRATION", "data": {"status": "active"}}
qemu-system-x86_64: Not a migration stream
{"timestamp": {"seconds": 1714068847, "microseconds": 266766}, "event": "MIGRATION", "data": {"status": "failed"}}
{"return": {"status": "failed", "error-desc": "load of migration failed: Invalid argument"}}
{"timestamp": {"seconds": 1714068848, "microseconds": 237187}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
{"return": {}}
Vladimir Sementsov-Ogievskiy (4):
migration: move trace-point from migrate_fd_error to migrate_set_error
migration: process_incoming_migration_co(): complete cleanup on
failure
migration: process_incoming_migration_co(): rework error reporting
qapi: introduce exit-on-error parameter for migrate-incoming
migration/migration-hmp-cmds.c | 2 +-
migration/migration.c | 57 ++++++++++++++++++++++++----------
migration/migration.h | 3 ++
migration/trace-events | 2 +-
qapi/migration.json | 7 ++++-
system/vl.c | 3 +-
6 files changed, 54 insertions(+), 20 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v4 1/4] migration: move trace-point from migrate_fd_error to migrate_set_error
2024-04-29 7:55 [PATCH v4 0/4] migration: do not exit on incoming failure Vladimir Sementsov-Ogievskiy
@ 2024-04-29 7:55 ` Vladimir Sementsov-Ogievskiy
2024-04-29 19:21 ` Peter Xu
2024-04-29 7:55 ` [PATCH v4 2/4] migration: process_incoming_migration_co(): complete cleanup on failure Vladimir Sementsov-Ogievskiy
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-04-29 7:55 UTC (permalink / raw)
To: peterx, farosas; +Cc: eblake, armbru, pbonzini, qemu-devel, vsementsov, yc-core
Cover more cases by trace-point.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
---
migration/migration.c | 4 +++-
migration/trace-events | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index b5af6b5105..2dc6a063e9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1421,6 +1421,9 @@ static void migrate_fd_cleanup_bh(void *opaque)
void migrate_set_error(MigrationState *s, const Error *error)
{
QEMU_LOCK_GUARD(&s->error_mutex);
+
+ trace_migrate_error(error_get_pretty(error));
+
if (!s->error) {
s->error = error_copy(error);
}
@@ -1444,7 +1447,6 @@ static void migrate_error_free(MigrationState *s)
static void migrate_fd_error(MigrationState *s, const Error *error)
{
- trace_migrate_fd_error(error_get_pretty(error));
assert(s->to_dst_file == NULL);
migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_FAILED);
diff --git a/migration/trace-events b/migration/trace-events
index f0e1cb80c7..d0c44c3853 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -152,7 +152,7 @@ multifd_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostnam
# migration.c
migrate_set_state(const char *new_state) "new state %s"
migrate_fd_cleanup(void) ""
-migrate_fd_error(const char *error_desc) "error=%s"
+migrate_error(const char *error_desc) "error=%s"
migrate_fd_cancel(void) ""
migrate_handle_rp_req_pages(const char *rbname, size_t start, size_t len) "in %s at 0x%zx len 0x%zx"
migrate_pending_exact(uint64_t size, uint64_t pre, uint64_t post) "exact pending size %" PRIu64 " (pre = %" PRIu64 " post=%" PRIu64 ")"
--
2.34.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v4 2/4] migration: process_incoming_migration_co(): complete cleanup on failure
2024-04-29 7:55 [PATCH v4 0/4] migration: do not exit on incoming failure Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 1/4] migration: move trace-point from migrate_fd_error to migrate_set_error Vladimir Sementsov-Ogievskiy
@ 2024-04-29 7:55 ` Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 3/4] migration: process_incoming_migration_co(): rework error reporting Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 4/4] qapi: introduce exit-on-error parameter for migrate-incoming Vladimir Sementsov-Ogievskiy
3 siblings, 0 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-04-29 7:55 UTC (permalink / raw)
To: peterx, farosas; +Cc: eblake, armbru, pbonzini, qemu-devel, vsementsov, yc-core
Make call to migration_incoming_state_destroy(), instead of doing only
partial of it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
---
migration/migration.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 2dc6a063e9..0d26db47f7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -799,10 +799,7 @@ process_incoming_migration_co(void *opaque)
fail:
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
- qemu_fclose(mis->from_src_file);
-
- multifd_recv_cleanup();
- compress_threads_load_cleanup();
+ migration_incoming_state_destroy();
exit(EXIT_FAILURE);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v4 3/4] migration: process_incoming_migration_co(): rework error reporting
2024-04-29 7:55 [PATCH v4 0/4] migration: do not exit on incoming failure Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 1/4] migration: move trace-point from migrate_fd_error to migrate_set_error Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 2/4] migration: process_incoming_migration_co(): complete cleanup on failure Vladimir Sementsov-Ogievskiy
@ 2024-04-29 7:55 ` Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 4/4] qapi: introduce exit-on-error parameter for migrate-incoming Vladimir Sementsov-Ogievskiy
3 siblings, 0 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-04-29 7:55 UTC (permalink / raw)
To: peterx, farosas; +Cc: eblake, armbru, pbonzini, qemu-devel, vsementsov, yc-core
Unify error reporting in the function. This simplifies the following
commit, which will not-exit-on-error behavior variant to the function.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
---
migration/migration.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 0d26db47f7..0b15f7ccf4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -735,14 +735,16 @@ static void process_incoming_migration_bh(void *opaque)
static void coroutine_fn
process_incoming_migration_co(void *opaque)
{
+ MigrationState *s = migrate_get_current();
MigrationIncomingState *mis = migration_incoming_get_current();
PostcopyState ps;
int ret;
+ Error *local_err = NULL;
assert(mis->from_src_file);
if (compress_threads_load_setup(mis->from_src_file)) {
- error_report("Failed to setup decompress threads");
+ error_setg(&local_err, "Failed to setup decompress threads");
goto fail;
}
@@ -779,18 +781,12 @@ process_incoming_migration_co(void *opaque)
}
if (ret < 0) {
- MigrationState *s = migrate_get_current();
-
- if (migrate_has_error(s)) {
- WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
- error_report_err(s->error);
- }
- }
- error_report("load of migration failed: %s", strerror(-ret));
+ error_setg(&local_err, "load of migration failed: %s", strerror(-ret));
goto fail;
}
if (colo_incoming_co() < 0) {
+ error_setg(&local_err, "colo incoming failed");
goto fail;
}
@@ -801,6 +797,12 @@ fail:
MIGRATION_STATUS_FAILED);
migration_incoming_state_destroy();
+ if (migrate_has_error(s)) {
+ WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
+ error_report_err(s->error);
+ }
+ }
+ error_report_err(local_err);
exit(EXIT_FAILURE);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v4 4/4] qapi: introduce exit-on-error parameter for migrate-incoming
2024-04-29 7:55 [PATCH v4 0/4] migration: do not exit on incoming failure Vladimir Sementsov-Ogievskiy
` (2 preceding siblings ...)
2024-04-29 7:55 ` [PATCH v4 3/4] migration: process_incoming_migration_co(): rework error reporting Vladimir Sementsov-Ogievskiy
@ 2024-04-29 7:55 ` Vladimir Sementsov-Ogievskiy
3 siblings, 0 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-04-29 7:55 UTC (permalink / raw)
To: peterx, farosas; +Cc: eblake, armbru, pbonzini, qemu-devel, vsementsov, yc-core
Now we do set MIGRATION_FAILED state, but don't give a chance to
orchestrator to query migration state and get the error.
Let's provide a possibility for QMP-based orchestrators to get an error
like with outgoing migration.
For hmp_migrate_incoming(), let's enable the new behavior: HMP is not
and ABI, it's mostly intended to use by developer and it makes sense
not to stop the process.
For x-exit-preconfig, let's keep the old behavior:
- it's called from init(), so here we want to keep current behavior by
default
- it does exit on error by itself as well
So, if we want to change the behavior of x-exit-preconfig, it should be
another patch.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
---
migration/migration-hmp-cmds.c | 2 +-
migration/migration.c | 36 ++++++++++++++++++++++++++++------
migration/migration.h | 3 +++
qapi/migration.json | 7 ++++++-
system/vl.c | 3 ++-
5 files changed, 42 insertions(+), 9 deletions(-)
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 7e96ae6ffd..23181bbee1 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -466,7 +466,7 @@ void hmp_migrate_incoming(Monitor *mon, const QDict *qdict)
}
QAPI_LIST_PREPEND(caps, g_steal_pointer(&channel));
- qmp_migrate_incoming(NULL, true, caps, &err);
+ qmp_migrate_incoming(NULL, true, caps, true, false, &err);
qapi_free_MigrationChannelList(caps);
end:
diff --git a/migration/migration.c b/migration/migration.c
index 0b15f7ccf4..5cfe420a76 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -72,6 +72,8 @@
#define NOTIFIER_ELEM_INIT(array, elem) \
[elem] = NOTIFIER_WITH_RETURN_LIST_INITIALIZER((array)[elem])
+#define INMIGRATE_DEFAULT_EXIT_ON_ERROR true
+
static NotifierWithReturnList migration_state_notifiers[] = {
NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_NORMAL),
NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_REBOOT),
@@ -234,6 +236,8 @@ void migration_object_init(void)
qemu_cond_init(¤t_incoming->page_request_cond);
current_incoming->page_requested = g_tree_new(page_request_addr_cmp);
+ current_incoming->exit_on_error = INMIGRATE_DEFAULT_EXIT_ON_ERROR;
+
migration_object_check(current_migration, &error_fatal);
blk_mig_init();
@@ -797,13 +801,18 @@ fail:
MIGRATION_STATUS_FAILED);
migration_incoming_state_destroy();
- if (migrate_has_error(s)) {
- WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
- error_report_err(s->error);
+ if (mis->exit_on_error) {
+ if (migrate_has_error(s)) {
+ WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
+ error_report_err(s->error);
+ }
}
+ error_report_err(local_err);
+ exit(EXIT_FAILURE);
+ } else {
+ migrate_set_error(s, local_err);
+ error_free(local_err);
}
- error_report_err(local_err);
- exit(EXIT_FAILURE);
}
/**
@@ -1312,6 +1321,15 @@ static void fill_destination_migration_info(MigrationInfo *info)
break;
}
info->status = mis->state;
+
+ if (!info->error_desc) {
+ MigrationState *s = migrate_get_current();
+ QEMU_LOCK_GUARD(&s->error_mutex);
+
+ if (s->error) {
+ info->error_desc = g_strdup(error_get_pretty(s->error));
+ }
+ }
}
MigrationInfo *qmp_query_migrate(Error **errp)
@@ -1795,10 +1813,13 @@ void migrate_del_blocker(Error **reasonp)
}
void qmp_migrate_incoming(const char *uri, bool has_channels,
- MigrationChannelList *channels, Error **errp)
+ MigrationChannelList *channels,
+ bool has_exit_on_error, bool exit_on_error,
+ Error **errp)
{
Error *local_err = NULL;
static bool once = true;
+ MigrationIncomingState *mis = migration_incoming_get_current();
if (!once) {
error_setg(errp, "The incoming migration has already been started");
@@ -1813,6 +1834,9 @@ void qmp_migrate_incoming(const char *uri, bool has_channels,
return;
}
+ mis->exit_on_error =
+ has_exit_on_error ? exit_on_error : INMIGRATE_DEFAULT_EXIT_ON_ERROR;
+
qemu_start_incoming_migration(uri, has_channels, channels, &local_err);
if (local_err) {
diff --git a/migration/migration.h b/migration/migration.h
index 8045e39c26..95995a818e 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -227,6 +227,9 @@ struct MigrationIncomingState {
* is needed as this field is updated serially.
*/
unsigned int switchover_ack_pending_num;
+
+ /* Do exit on incoming migration failure */
+ bool exit_on_error;
};
MigrationIncomingState *migration_incoming_get_current(void);
diff --git a/qapi/migration.json b/qapi/migration.json
index 8c65b90328..9feed413b5 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1837,6 +1837,10 @@
# @channels: list of migration stream channels with each stream in the
# list connected to a destination interface endpoint.
#
+# @exit-on-error: Exit on incoming migration failure. Default true.
+# When set to false, the failure triggers a MIGRATION event, and
+# error details could be retrieved with query-migrate. (since 9.1)
+#
# Since: 2.3
#
# Notes:
@@ -1889,7 +1893,8 @@
##
{ 'command': 'migrate-incoming',
'data': {'*uri': 'str',
- '*channels': [ 'MigrationChannel' ] } }
+ '*channels': [ 'MigrationChannel' ],
+ '*exit-on-error': 'bool' } }
##
# @xen-save-devices-state:
diff --git a/system/vl.c b/system/vl.c
index 7756eac81e..79cd498395 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -2723,7 +2723,8 @@ void qmp_x_exit_preconfig(Error **errp)
if (incoming) {
Error *local_err = NULL;
if (strcmp(incoming, "defer") != 0) {
- qmp_migrate_incoming(incoming, false, NULL, &local_err);
+ qmp_migrate_incoming(incoming, false, NULL, true, true,
+ &local_err);
if (local_err) {
error_reportf_err(local_err, "-incoming %s: ", incoming);
exit(1);
--
2.34.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v4 1/4] migration: move trace-point from migrate_fd_error to migrate_set_error
2024-04-29 7:55 ` [PATCH v4 1/4] migration: move trace-point from migrate_fd_error to migrate_set_error Vladimir Sementsov-Ogievskiy
@ 2024-04-29 19:21 ` Peter Xu
0 siblings, 0 replies; 6+ messages in thread
From: Peter Xu @ 2024-04-29 19:21 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: farosas, eblake, armbru, pbonzini, qemu-devel, yc-core
On Mon, Apr 29, 2024 at 10:55:38AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Cover more cases by trace-point.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-04-29 19:22 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-29 7:55 [PATCH v4 0/4] migration: do not exit on incoming failure Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 1/4] migration: move trace-point from migrate_fd_error to migrate_set_error Vladimir Sementsov-Ogievskiy
2024-04-29 19:21 ` Peter Xu
2024-04-29 7:55 ` [PATCH v4 2/4] migration: process_incoming_migration_co(): complete cleanup on failure Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 3/4] migration: process_incoming_migration_co(): rework error reporting Vladimir Sementsov-Ogievskiy
2024-04-29 7:55 ` [PATCH v4 4/4] qapi: introduce exit-on-error parameter for migrate-incoming Vladimir Sementsov-Ogievskiy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).