* [PATCH v2 0/2] Field 'reason' for MIGRATION event
@ 2024-02-15 12:27 Roman Khapov
2024-02-15 12:27 ` [PATCH v2 1/2] qapi/migration.json: add reason to " Roman Khapov
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Roman Khapov @ 2024-02-15 12:27 UTC (permalink / raw)
To: qemu-devel; +Cc: peterx, farosas, eblake, armbru, yc-core, Roman Khapov
This is resending of series 20240215082659.1378342-1-rkhapov@yandex-team.ru,
where patch subjects numbers were broken in patch 2/2.
Sometimes, when migration fails, it is hard to find out
the cause of the problems: you have to grep qemu logs.
At the same time, there is MIGRATION event, which looks like
suitable place to hold such error descriptions.
To handle situation like this (maybe one day it will be useful
for other MIGRATION statuses to have additional 'reason' strings),
the general optional field 'reason' can be added.
The series proposes next changes:
1. Adding optional 'reason' field of type str into
qapi/migration.json MIGRATION event
2. Passing some error description as reason for every place, which
sets migration state to MIGRATION_STATUS_FAILED
After the series, MIGRATION event will looks like this:
{"execute": "qmp_capabilities"}
{"return": {}}
{"event": "MIGRATION", "data": {"status": "setup"}}
{"event": "MIGRATION", "data": {"status": "failed", "reason": "Failed to connect to '/tmp/sock.sock': No such file or directory"}}
Roman Khapov (2):
qapi/migration.json: add reason to MIGRATION event
migration: add error reason for failed MIGRATION events
migration/colo.c | 6 +-
migration/migration.c | 128 ++++++++++++++++++++++++++++--------------
migration/migration.h | 5 +-
migration/multifd.c | 10 ++--
migration/savevm.c | 24 ++++----
qapi/migration.json | 3 +-
6 files changed, 112 insertions(+), 64 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] qapi/migration.json: add reason to MIGRATION event
2024-02-15 12:27 [PATCH v2 0/2] Field 'reason' for MIGRATION event Roman Khapov
@ 2024-02-15 12:27 ` Roman Khapov
2024-02-16 6:17 ` Markus Armbruster
2024-02-20 5:39 ` Peter Xu
2024-02-15 12:27 ` [PATCH v2 2/2] migration: add error reason for failed MIGRATION events Roman Khapov
2024-02-21 14:45 ` [PATCH v2 0/2] Field 'reason' for MIGRATION event Fabiano Rosas
2 siblings, 2 replies; 12+ messages in thread
From: Roman Khapov @ 2024-02-15 12:27 UTC (permalink / raw)
To: qemu-devel; +Cc: peterx, farosas, eblake, armbru, yc-core, Roman Khapov
This commit adds the optional field reason for the events, which
contains the string, describing reason of status changing.
For example: reason of migration fail.
Function migrate_set_state now accepts 4th argument: the reason to
pass to event. Every call of this function appended with NULL argument.
Also migrate_set_state_err_reason was added to form reason from Error*
Signed-off-by: Roman Khapov <rkhapov@yandex-team.ru>
---
migration/colo.c | 6 +--
migration/migration.c | 86 ++++++++++++++++++++++++++-----------------
migration/migration.h | 5 ++-
migration/multifd.c | 6 +--
migration/savevm.c | 14 +++----
qapi/migration.json | 3 +-
6 files changed, 71 insertions(+), 49 deletions(-)
diff --git a/migration/colo.c b/migration/colo.c
index 315e31fe32..3c95291a83 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -102,7 +102,7 @@ static void secondary_vm_do_failover(void)
}
migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
- MIGRATION_STATUS_COMPLETED);
+ MIGRATION_STATUS_COMPLETED, NULL);
replication_stop_all(true, &local_err);
if (local_err) {
@@ -157,7 +157,7 @@ static void primary_vm_do_failover(void)
Error *local_err = NULL;
migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
- MIGRATION_STATUS_COMPLETED);
+ MIGRATION_STATUS_COMPLETED, NULL);
/*
* kick COLO thread which might wait at
* qemu_sem_wait(&s->colo_checkpoint_sem).
@@ -823,7 +823,7 @@ static void *colo_process_incoming_thread(void *opaque)
qemu_sem_init(&mis->colo_incoming_sem, 0);
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_COLO);
+ MIGRATION_STATUS_COLO, NULL);
if (get_colo_mode() != COLO_MODE_SECONDARY) {
error_report("COLO mode must be COLO_MODE_SECONDARY");
diff --git a/migration/migration.c b/migration/migration.c
index ab21de2cad..d28885a55b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -354,10 +354,10 @@ void migration_incoming_state_destroy(void)
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
}
-static void migrate_generate_event(int new_state)
+static void migrate_generate_event(int new_state, const char *reason)
{
if (migrate_events()) {
- qapi_event_send_migration(new_state);
+ qapi_event_send_migration(new_state, reason);
}
}
@@ -598,7 +598,7 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
}
migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
- MIGRATION_STATUS_SETUP);
+ MIGRATION_STATUS_SETUP, NULL);
if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
SocketAddress *saddr = &addr->u.socket;
@@ -692,7 +692,7 @@ static void process_incoming_migration_bh(void *opaque)
* it's ready to use.
*/
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_COMPLETED);
+ MIGRATION_STATUS_COMPLETED, NULL);
migration_incoming_state_destroy();
}
@@ -713,7 +713,7 @@ process_incoming_migration_co(void *opaque)
mis->largest_page_size = qemu_ram_pagesize_largest();
postcopy_state_set(POSTCOPY_INCOMING_NONE);
migrate_set_state(&mis->state, MIGRATION_STATUS_SETUP,
- MIGRATION_STATUS_ACTIVE);
+ MIGRATION_STATUS_ACTIVE, NULL);
mis->loadvm_co = qemu_coroutine_self();
ret = qemu_loadvm_state(mis->from_src_file);
@@ -762,7 +762,7 @@ process_incoming_migration_co(void *opaque)
return;
fail:
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
qemu_fclose(mis->from_src_file);
multifd_recv_cleanup();
@@ -808,7 +808,7 @@ static bool postcopy_try_recover(void)
mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
- MIGRATION_STATUS_POSTCOPY_RECOVER);
+ MIGRATION_STATUS_POSTCOPY_RECOVER, NULL);
/*
* Here, we only wake up the main loading thread (while the
@@ -1308,15 +1308,27 @@ void qmp_migrate_start_postcopy(Error **errp)
/* shared migration helpers */
-void migrate_set_state(int *state, int old_state, int new_state)
+void migrate_set_state(int *state, int old_state,
+ int new_state, const char *reason)
{
assert(new_state < MIGRATION_STATUS__MAX);
if (qatomic_cmpxchg(state, old_state, new_state) == old_state) {
trace_migrate_set_state(MigrationStatus_str(new_state));
- migrate_generate_event(new_state);
+ migrate_generate_event(new_state, reason);
}
}
+void migrate_set_state_err_reason(int *state, int old_state,
+ int new_state, const Error *err)
+{
+ const char *reason = NULL;
+ if (likely(err != NULL)) {
+ reason = error_get_pretty(err);
+ }
+
+ migrate_set_state(state, old_state, new_state, reason);
+}
+
static void migrate_fd_cleanup(MigrationState *s)
{
g_free(s->hostname);
@@ -1360,7 +1372,7 @@ static void migrate_fd_cleanup(MigrationState *s)
if (s->state == MIGRATION_STATUS_CANCELLING) {
migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
- MIGRATION_STATUS_CANCELLED);
+ MIGRATION_STATUS_CANCELLED, NULL);
}
if (s->error) {
@@ -1406,7 +1418,7 @@ static void migrate_fd_error(MigrationState *s, const Error *error)
trace_migrate_fd_error(error_get_pretty(error));
assert(s->to_dst_file == NULL);
migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
migrate_set_error(s, error);
}
@@ -1432,7 +1444,8 @@ static void migrate_fd_cancel(MigrationState *s)
if (old_state == MIGRATION_STATUS_PRE_SWITCHOVER) {
qemu_sem_post(&s->pause_sem);
}
- migrate_set_state(&s->state, old_state, MIGRATION_STATUS_CANCELLING);
+ migrate_set_state(&s->state, old_state,
+ MIGRATION_STATUS_CANCELLING, NULL);
} while (s->state != MIGRATION_STATUS_CANCELLING);
/*
@@ -1612,7 +1625,8 @@ int migrate_init(MigrationState *s, Error **errp)
s->error = NULL;
s->vmdesc = NULL;
- migrate_set_state(&s->state, MIGRATION_STATUS_NONE, MIGRATION_STATUS_SETUP);
+ migrate_set_state(&s->state, MIGRATION_STATUS_NONE,
+ MIGRATION_STATUS_SETUP, NULL);
s->start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
s->total_time = 0;
@@ -2024,7 +2038,7 @@ void qmp_migrate(const char *uri, bool has_channels,
error_setg(&local_err, QERR_INVALID_PARAMETER_VALUE, "uri",
"a valid migration protocol");
migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
block_cleanup_parameters();
}
@@ -2145,7 +2159,7 @@ static bool migrate_handle_rp_resume_ack(MigrationState *s,
/* Now both sides are active. */
migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_RECOVER,
- MIGRATION_STATUS_POSTCOPY_ACTIVE);
+ MIGRATION_STATUS_POSTCOPY_ACTIVE, NULL);
/* Notify send thread that time to continue send pages */
migration_rp_kick(s);
@@ -2422,14 +2436,15 @@ static int postcopy_start(MigrationState *ms, Error **errp)
if (migrate_postcopy_preempt()) {
migration_wait_main_channel(ms);
if (postcopy_preempt_establish_channel(ms)) {
- migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
+ migrate_set_state(&ms->state, ms->state,
+ MIGRATION_STATUS_FAILED, NULL);
return -1;
}
}
if (!migrate_pause_before_switchover()) {
migrate_set_state(&ms->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_POSTCOPY_ACTIVE);
+ MIGRATION_STATUS_POSTCOPY_ACTIVE, NULL);
}
trace_postcopy_start();
@@ -2559,7 +2574,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
if (ret) {
error_setg(errp, "postcopy_start: Migration stream errored");
migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
}
trace_postcopy_preempt_enabled(migrate_postcopy_preempt());
@@ -2570,7 +2585,7 @@ fail_closefb:
qemu_fclose(fb);
fail:
migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
if (restart_block) {
/* A failure happened early enough that we know the destination hasn't
* accessed block devices, so we're safe to recover.
@@ -2618,10 +2633,10 @@ static int migration_maybe_pause(MigrationState *s,
if (s->state != MIGRATION_STATUS_CANCELLING) {
bql_unlock();
migrate_set_state(&s->state, *current_active_state,
- MIGRATION_STATUS_PRE_SWITCHOVER);
+ MIGRATION_STATUS_PRE_SWITCHOVER, NULL);
qemu_sem_wait(&s->pause_sem);
migrate_set_state(&s->state, MIGRATION_STATUS_PRE_SWITCHOVER,
- new_state);
+ new_state, NULL);
*current_active_state = new_state;
bql_lock();
}
@@ -2706,7 +2721,7 @@ static void migration_completion_failed(MigrationState *s,
}
migrate_set_state(&s->state, current_active_state,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
}
/**
@@ -2744,10 +2759,10 @@ static void migration_completion(MigrationState *s)
if (migrate_colo() && s->state == MIGRATION_STATUS_ACTIVE) {
/* COLO does not support postcopy */
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_COLO);
+ MIGRATION_STATUS_COLO, NULL);
} else {
migrate_set_state(&s->state, current_active_state,
- MIGRATION_STATUS_COMPLETED);
+ MIGRATION_STATUS_COMPLETED, NULL);
}
return;
@@ -2785,12 +2800,12 @@ static void bg_migration_completion(MigrationState *s)
}
migrate_set_state(&s->state, current_active_state,
- MIGRATION_STATUS_COMPLETED);
+ MIGRATION_STATUS_COMPLETED, NULL);
return;
fail:
migrate_set_state(&s->state, current_active_state,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
}
typedef enum MigThrError {
@@ -2901,7 +2916,7 @@ static MigThrError postcopy_pause(MigrationState *s)
close_return_path_on_source(s);
migrate_set_state(&s->state, s->state,
- MIGRATION_STATUS_POSTCOPY_PAUSED);
+ MIGRATION_STATUS_POSTCOPY_PAUSED, NULL);
error_report("Detected IO failure for postcopy. "
"Migration paused.");
@@ -2979,7 +2994,7 @@ static MigThrError migration_detect_error(MigrationState *s)
* For precopy (or postcopy with error outside IO), we fail
* with no time.
*/
- migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILED);
+ migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILED, NULL);
trace_migration_thread_file_err();
/* Time to stop the migration, now. */
@@ -3279,7 +3294,8 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
int new_state)
{
if (qemu_savevm_state_guest_unplug_pending()) {
- migrate_set_state(&s->state, old_state, MIGRATION_STATUS_WAIT_UNPLUG);
+ migrate_set_state(&s->state, old_state,
+ MIGRATION_STATUS_WAIT_UNPLUG, NULL);
while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
qemu_savevm_state_guest_unplug_pending()) {
@@ -3302,9 +3318,10 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
}
}
- migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
+ migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG,
+ new_state, NULL);
} else {
- migrate_set_state(&s->state, old_state, new_state);
+ migrate_set_state(&s->state, old_state, new_state, NULL);
}
}
@@ -3549,7 +3566,7 @@ static void *bg_migration_thread(void *opaque)
fail:
if (early_fail) {
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
bql_unlock();
}
@@ -3615,7 +3632,8 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
if (migrate_postcopy_ram() || migrate_return_path()) {
if (open_return_path_on_source(s)) {
error_setg(&local_err, "Unable to open return-path for postcopy");
- migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+ migrate_set_state(&s->state, s->state,
+ MIGRATION_STATUS_FAILED, NULL);
migrate_set_error(s, local_err);
error_report_err(local_err);
migrate_fd_cleanup(s);
@@ -3635,7 +3653,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
if (resume) {
/* Wakeup the main migration thread to do the recovery */
migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
- MIGRATION_STATUS_POSTCOPY_RECOVER);
+ MIGRATION_STATUS_POSTCOPY_RECOVER, NULL);
qemu_sem_post(&s->postcopy_pause_sem);
return;
}
diff --git a/migration/migration.h b/migration/migration.h
index f2c8b8f286..9150478654 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -468,7 +468,10 @@ struct MigrationState {
bool rdma_migration;
};
-void migrate_set_state(int *state, int old_state, int new_state);
+void migrate_set_state(int *state, int old_state,
+ int new_state, const char *reason);
+void migrate_set_state_err_reason(int *state, int old_state,
+ int new_state, const Error *err);
void migration_fd_process_incoming(QEMUFile *f);
void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
diff --git a/migration/multifd.c b/migration/multifd.c
index adfe8c9a0a..da3d397642 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -595,7 +595,7 @@ static void multifd_send_set_error(Error *err)
s->state == MIGRATION_STATUS_DEVICE ||
s->state == MIGRATION_STATUS_ACTIVE) {
migrate_set_state(&s->state, s->state,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
}
}
}
@@ -1056,7 +1056,7 @@ bool multifd_send_setup(void)
migrate_set_error(s, local_err);
error_report_err(local_err);
migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
return false;
}
@@ -1087,7 +1087,7 @@ static void multifd_recv_terminate_threads(Error *err)
if (s->state == MIGRATION_STATUS_SETUP ||
s->state == MIGRATION_STATUS_ACTIVE) {
migrate_set_state(&s->state, s->state,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
}
}
diff --git a/migration/savevm.c b/migration/savevm.c
index d612c8a902..be6cce8a51 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1744,7 +1744,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
} else {
status = MIGRATION_STATUS_COMPLETED;
}
- migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status);
+ migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status, NULL);
/* f is outer parameter, it should not stay in global migration state after
* this function finished */
@@ -1999,7 +1999,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
object_ref(OBJECT(migr));
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_POSTCOPY_ACTIVE);
+ MIGRATION_STATUS_POSTCOPY_ACTIVE, NULL);
qemu_sem_post(&mis->thread_sync_sem);
trace_postcopy_ram_listen_thread_start();
@@ -2037,7 +2037,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
} else {
error_report("%s: loadvm failed: %d", __func__, load_res);
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_FAILED);
+ MIGRATION_STATUS_FAILED, NULL);
}
}
if (load_res >= 0) {
@@ -2062,7 +2062,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
}
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_COMPLETED);
+ MIGRATION_STATUS_COMPLETED, NULL);
/*
* If everything has worked fine, then the main thread has waited
* for us to start, and we're the last use of the mis.
@@ -2257,7 +2257,7 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
* This means source VM is ready to resume the postcopy migration.
*/
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_RECOVER,
- MIGRATION_STATUS_POSTCOPY_ACTIVE);
+ MIGRATION_STATUS_POSTCOPY_ACTIVE, NULL);
trace_loadvm_postcopy_handle_resume();
@@ -2818,8 +2818,8 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
}
/* Current state can be either ACTIVE or RECOVER */
- migrate_set_state(&mis->state, mis->state,
- MIGRATION_STATUS_POSTCOPY_PAUSED);
+ migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+ MIGRATION_STATUS_POSTCOPY_PAUSED, NULL);
/* Notify the fault thread for the invalidated file handle */
postcopy_fault_thread_notify(mis);
diff --git a/qapi/migration.json b/qapi/migration.json
index 5a565d9b8d..33bb5b7f50 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1392,6 +1392,7 @@
# Emitted when a migration event happens
#
# @status: @MigrationStatus describing the current migration status.
+# @reason: Optional description of status changing reason.
#
# Since: 2.4
#
@@ -1402,7 +1403,7 @@
# "data": {"status": "completed"} }
##
{ 'event': 'MIGRATION',
- 'data': {'status': 'MigrationStatus'}}
+ 'data': {'status': 'MigrationStatus', '*reason': 'str'}}
##
# @MIGRATION_PASS:
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 2/2] migration: add error reason for failed MIGRATION events
2024-02-15 12:27 [PATCH v2 0/2] Field 'reason' for MIGRATION event Roman Khapov
2024-02-15 12:27 ` [PATCH v2 1/2] qapi/migration.json: add reason to " Roman Khapov
@ 2024-02-15 12:27 ` Roman Khapov
2024-02-21 14:45 ` [PATCH v2 0/2] Field 'reason' for MIGRATION event Fabiano Rosas
2 siblings, 0 replies; 12+ messages in thread
From: Roman Khapov @ 2024-02-15 12:27 UTC (permalink / raw)
To: qemu-devel; +Cc: peterx, farosas, eblake, armbru, yc-core, Roman Khapov
This patch adds error description as reason for event MIGRATION in every
place that generates MIGRATION_STATE_FAILED
Signed-off-by: Roman Khapov <rkhapov@yandex-team.ru>
---
migration/migration.c | 62 ++++++++++++++++++++++++++++++-------------
migration/multifd.c | 8 +++---
migration/savevm.c | 12 ++++-----
3 files changed, 54 insertions(+), 28 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index d28885a55b..0af16d5fa9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -702,11 +702,14 @@ process_incoming_migration_co(void *opaque)
MigrationIncomingState *mis = migration_incoming_get_current();
PostcopyState ps;
int ret;
+ g_autofree char *fail_reason = NULL;
assert(mis->from_src_file);
if (compress_threads_load_setup(mis->from_src_file)) {
- error_report("Failed to setup decompress threads");
+ fail_reason = g_strdup("Failed to setup decompress threads");
+ /* wrap with %s to silence compiler warning of non-literal in format */
+ error_report("%s", fail_reason);
goto fail;
}
@@ -750,11 +753,15 @@ process_incoming_migration_co(void *opaque)
error_report_err(s->error);
}
}
- error_report("load of migration failed: %s", strerror(-ret));
+ fail_reason = g_strdup_printf("load of migration failed: %s",
+ strerror(-ret));
+ /* wrap with %s to silence compiler warning of non-literal in format */
+ error_report("%s", fail_reason);
goto fail;
}
if (colo_incoming_co() < 0) {
+ fail_reason = g_strdup("colo_incoming failed");
goto fail;
}
@@ -762,7 +769,7 @@ process_incoming_migration_co(void *opaque)
return;
fail:
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_FAILED, NULL);
+ MIGRATION_STATUS_FAILED, fail_reason);
qemu_fclose(mis->from_src_file);
multifd_recv_cleanup();
@@ -1417,8 +1424,8 @@ static void migrate_fd_error(MigrationState *s, const Error *error)
{
trace_migrate_fd_error(error_get_pretty(error));
assert(s->to_dst_file == NULL);
- migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
- MIGRATION_STATUS_FAILED, NULL);
+ migrate_set_state_err_reason(&s->state, MIGRATION_STATUS_SETUP,
+ MIGRATION_STATUS_FAILED, error);
migrate_set_error(s, error);
}
@@ -1968,6 +1975,7 @@ void qmp_migrate(const char *uri, bool has_channels,
bool has_inc, bool inc, bool has_detach, bool detach,
bool has_resume, bool resume, Error **errp)
{
+ ERRP_GUARD();
bool resume_requested;
Error *local_err = NULL;
MigrationState *s = migrate_get_current();
@@ -2037,8 +2045,8 @@ void qmp_migrate(const char *uri, bool has_channels,
} else {
error_setg(&local_err, QERR_INVALID_PARAMETER_VALUE, "uri",
"a valid migration protocol");
- migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
- MIGRATION_STATUS_FAILED, NULL);
+ migrate_set_state_err_reason(&s->state, MIGRATION_STATUS_SETUP,
+ MIGRATION_STATUS_FAILED, *errp);
block_cleanup_parameters();
}
@@ -2426,6 +2434,7 @@ migration_wait_main_channel(MigrationState *ms)
*/
static int postcopy_start(MigrationState *ms, Error **errp)
{
+ ERRP_GUARD();
int ret;
QIOChannelBuffer *bioc;
QEMUFile *fb;
@@ -2436,8 +2445,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
if (migrate_postcopy_preempt()) {
migration_wait_main_channel(ms);
if (postcopy_preempt_establish_channel(ms)) {
- migrate_set_state(&ms->state, ms->state,
- MIGRATION_STATUS_FAILED, NULL);
+ error_setg(errp,
+ "postcopy_start: establishing channel failed");
+ migrate_set_state_err_reason(&ms->state, ms->state,
+ MIGRATION_STATUS_FAILED, *errp);
return -1;
}
}
@@ -2456,17 +2467,21 @@ static int postcopy_start(MigrationState *ms, Error **errp)
global_state_store();
ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
if (ret < 0) {
+ error_setg(errp, "postcopy_start: vm stop failed");
goto fail;
}
ret = migration_maybe_pause(ms, &cur_state,
MIGRATION_STATUS_POSTCOPY_ACTIVE);
if (ret < 0) {
+ error_setg(errp, "postcopy_start: migratoin pause failed");
goto fail;
}
ret = bdrv_inactivate_all();
if (ret < 0) {
+ error_setg(errp,
+ "postcopy_start: making block drivers inactive failed");
goto fail;
}
restart_block = true;
@@ -2543,6 +2558,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
/* Now send that blob */
if (qemu_savevm_send_packaged(ms->to_dst_file, bioc->data, bioc->usage)) {
+ error_setg(errp, "postcopy_start: blob sending failed");
goto fail_closefb;
}
qemu_fclose(fb);
@@ -2573,8 +2589,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
ret = qemu_file_get_error(ms->to_dst_file);
if (ret) {
error_setg(errp, "postcopy_start: Migration stream errored");
- migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_FAILED, NULL);
+ migrate_set_state_err_reason(&ms->state,
+ MIGRATION_STATUS_POSTCOPY_ACTIVE,
+ MIGRATION_STATUS_FAILED, *errp);
}
trace_postcopy_preempt_enabled(migrate_postcopy_preempt());
@@ -2584,8 +2601,8 @@ static int postcopy_start(MigrationState *ms, Error **errp)
fail_closefb:
qemu_fclose(fb);
fail:
- migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_FAILED, NULL);
+ migrate_set_state_err_reason(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+ MIGRATION_STATUS_FAILED, *errp);
if (restart_block) {
/* A failure happened early enough that we know the destination hasn't
* accessed block devices, so we're safe to recover.
@@ -2700,7 +2717,8 @@ static void migration_completion_postcopy(MigrationState *s)
}
static void migration_completion_failed(MigrationState *s,
- int current_active_state)
+ int current_active_state,
+ const char *fail_reason)
{
if (s->block_inactive && (s->state == MIGRATION_STATUS_ACTIVE ||
s->state == MIGRATION_STATUS_DEVICE)) {
@@ -2721,7 +2739,7 @@ static void migration_completion_failed(MigrationState *s,
}
migrate_set_state(&s->state, current_active_state,
- MIGRATION_STATUS_FAILED, NULL);
+ MIGRATION_STATUS_FAILED, fail_reason);
}
/**
@@ -2733,6 +2751,7 @@ static void migration_completion_failed(MigrationState *s,
static void migration_completion(MigrationState *s)
{
int ret = 0;
+ const char *fail_reason = NULL;
int current_active_state = s->state;
if (s->state == MIGRATION_STATUS_ACTIVE) {
@@ -2740,6 +2759,7 @@ static void migration_completion(MigrationState *s)
} else if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
migration_completion_postcopy(s);
} else {
+ fail_reason = "migration completion: unexpected migration state";
ret = -1;
}
@@ -2748,6 +2768,7 @@ static void migration_completion(MigrationState *s)
}
if (close_return_path_on_source(s)) {
+ fail_reason = "migration completion: return path thread close failed";
goto fail;
}
@@ -2768,7 +2789,7 @@ static void migration_completion(MigrationState *s)
return;
fail:
- migration_completion_failed(s, current_active_state);
+ migration_completion_failed(s, current_active_state, fail_reason);
}
/**
@@ -2994,7 +3015,8 @@ static MigThrError migration_detect_error(MigrationState *s)
* For precopy (or postcopy with error outside IO), we fail
* with no time.
*/
- migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILED, NULL);
+ migrate_set_state_err_reason(&s->state, state,
+ MIGRATION_STATUS_FAILED, s->error);
trace_migration_thread_file_err();
/* Time to stop the migration, now. */
@@ -3458,6 +3480,7 @@ static void *bg_migration_thread(void *opaque)
MigThrError thr_error;
QEMUFile *fb;
bool early_fail = true;
+ const char *fail_reason = NULL;
rcu_register_thread();
object_ref(OBJECT(s));
@@ -3509,6 +3532,7 @@ static void *bg_migration_thread(void *opaque)
global_state_store();
/* Forcibly stop VM before saving state of vCPUs and devices */
if (migration_stop_vm(RUN_STATE_PAUSED)) {
+ fail_reason = "stopping vm failed";
goto fail;
}
/*
@@ -3517,6 +3541,7 @@ static void *bg_migration_thread(void *opaque)
*/
cpu_synchronize_all_states();
if (qemu_savevm_state_complete_precopy_non_iterable(fb, false, false)) {
+ fail_reason = "savevm state failed";
goto fail;
}
/*
@@ -3527,6 +3552,7 @@ static void *bg_migration_thread(void *opaque)
/* Now initialize UFFD context and start tracking RAM writes */
if (ram_write_tracking_start()) {
+ fail_reason = "starting UFFD-WP memory tracking failed";
goto fail;
}
early_fail = false;
@@ -3566,7 +3592,7 @@ static void *bg_migration_thread(void *opaque)
fail:
if (early_fail) {
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
- MIGRATION_STATUS_FAILED, NULL);
+ MIGRATION_STATUS_FAILED, fail_reason);
bql_unlock();
}
diff --git a/migration/multifd.c b/migration/multifd.c
index da3d397642..cb52ebc062 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -594,8 +594,8 @@ static void multifd_send_set_error(Error *err)
s->state == MIGRATION_STATUS_PRE_SWITCHOVER ||
s->state == MIGRATION_STATUS_DEVICE ||
s->state == MIGRATION_STATUS_ACTIVE) {
- migrate_set_state(&s->state, s->state,
- MIGRATION_STATUS_FAILED, NULL);
+ migrate_set_state_err_reason(&s->state, s->state,
+ MIGRATION_STATUS_FAILED, err);
}
}
}
@@ -1086,8 +1086,8 @@ static void multifd_recv_terminate_threads(Error *err)
migrate_set_error(s, err);
if (s->state == MIGRATION_STATUS_SETUP ||
s->state == MIGRATION_STATUS_ACTIVE) {
- migrate_set_state(&s->state, s->state,
- MIGRATION_STATUS_FAILED, NULL);
+ migrate_set_state_err_reason(&s->state, s->state,
+ MIGRATION_STATUS_FAILED, err);
}
}
diff --git a/migration/savevm.c b/migration/savevm.c
index be6cce8a51..52fd3e37db 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1700,9 +1700,9 @@ void qemu_savevm_state_cleanup(void)
static int qemu_savevm_state(QEMUFile *f, Error **errp)
{
+ ERRP_GUARD();
int ret;
MigrationState *ms = migrate_get_current();
- MigrationStatus status;
if (migration_is_running(ms->state)) {
error_setg(errp, QERR_MIGRATION_ACTIVE);
@@ -1735,16 +1735,16 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
ret = qemu_file_get_error(f);
}
qemu_savevm_state_cleanup();
+
if (ret != 0) {
error_setg_errno(errp, -ret, "Error while writing VM state");
- }
- if (ret != 0) {
- status = MIGRATION_STATUS_FAILED;
+ migrate_set_state_err_reason(&ms->state, MIGRATION_STATUS_SETUP,
+ MIGRATION_STATUS_FAILED, *errp);
} else {
- status = MIGRATION_STATUS_COMPLETED;
+ migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP,
+ MIGRATION_STATUS_COMPLETED, NULL);
}
- migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status, NULL);
/* f is outer parameter, it should not stay in global migration state after
* this function finished */
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] qapi/migration.json: add reason to MIGRATION event
2024-02-15 12:27 ` [PATCH v2 1/2] qapi/migration.json: add reason to " Roman Khapov
@ 2024-02-16 6:17 ` Markus Armbruster
2024-02-18 14:32 ` Roman Khapov
2024-02-20 5:39 ` Peter Xu
1 sibling, 1 reply; 12+ messages in thread
From: Markus Armbruster @ 2024-02-16 6:17 UTC (permalink / raw)
To: Roman Khapov; +Cc: qemu-devel, peterx, farosas, eblake, yc-core
Roman Khapov <rkhapov@yandex-team.ru> writes:
> This commit adds the optional field reason for the events, which
> contains the string, describing reason of status changing.
> For example: reason of migration fail.
>
> Function migrate_set_state now accepts 4th argument: the reason to
> pass to event. Every call of this function appended with NULL argument.
>
> Also migrate_set_state_err_reason was added to form reason from Error*
>
> Signed-off-by: Roman Khapov <rkhapov@yandex-team.ru>
[...]
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 5a565d9b8d..33bb5b7f50 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1392,6 +1392,7 @@
> # Emitted when a migration event happens
> #
> # @status: @MigrationStatus describing the current migration status.
> +# @reason: Optional description of status changing reason.
Intended use?
When is it present?
> #
> # Since: 2.4
> #
> @@ -1402,7 +1403,7 @@
> # "data": {"status": "completed"} }
> ##
> { 'event': 'MIGRATION',
> - 'data': {'status': 'MigrationStatus'}}
> + 'data': {'status': 'MigrationStatus', '*reason': 'str'}}
>
> ##
> # @MIGRATION_PASS:
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] qapi/migration.json: add reason to MIGRATION event
2024-02-16 6:17 ` Markus Armbruster
@ 2024-02-18 14:32 ` Roman Khapov
2024-02-19 6:35 ` Markus Armbruster
0 siblings, 1 reply; 12+ messages in thread
From: Roman Khapov @ 2024-02-18 14:32 UTC (permalink / raw)
To: Markus Armbruster; +Cc: qemu-devel, peterx, farosas, eblake, yc-core
To be clear: you meant that the description of the event must be
extended, similar to its description on the commit message? Or you don't
see its proper usage at all?
If the first is true, then I agree, the description can be improved, and
it will be useful, thanks! Will add it in the next version of the patch
soon.
On 2/16/24 11:17, Markus Armbruster wrote:
> Roman Khapov <rkhapov@yandex-team.ru> writes:
>
>> This commit adds the optional field reason for the events, which
>> contains the string, describing reason of status changing.
>> For example: reason of migration fail.
>>
>> Function migrate_set_state now accepts 4th argument: the reason to
>> pass to event. Every call of this function appended with NULL argument.
>>
>> Also migrate_set_state_err_reason was added to form reason from Error*
>>
>> Signed-off-by: Roman Khapov <rkhapov@yandex-team.ru>
> [...]
>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index 5a565d9b8d..33bb5b7f50 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -1392,6 +1392,7 @@
>> # Emitted when a migration event happens
>> #
>> # @status: @MigrationStatus describing the current migration status.
>> +# @reason: Optional description of status changing reason.
> Intended use?
>
> When is it present?
>
>> #
>> # Since: 2.4
>> #
>> @@ -1402,7 +1403,7 @@
>> # "data": {"status": "completed"} }
>> ##
>> { 'event': 'MIGRATION',
>> - 'data': {'status': 'MigrationStatus'}}
>> + 'data': {'status': 'MigrationStatus', '*reason': 'str'}}
>>
>> ##
>> # @MIGRATION_PASS:
--
Best regards,
Roman
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] qapi/migration.json: add reason to MIGRATION event
2024-02-18 14:32 ` Roman Khapov
@ 2024-02-19 6:35 ` Markus Armbruster
2024-02-21 13:36 ` Roman Khapov
0 siblings, 1 reply; 12+ messages in thread
From: Markus Armbruster @ 2024-02-19 6:35 UTC (permalink / raw)
To: Roman Khapov; +Cc: qemu-devel, peterx, farosas, eblake, yc-core
Roman Khapov <rkhapov@yandex-team.ru> writes:
> To be clear: you meant that the description of the event must be extended, similar to its description on the commit message? Or you don't see its proper usage at all?
The commit message doesn't really tell me either why and how anybody
would use @reason. Once we have a common understanding there, improving
the doc comment should be straightforward.
> If the first is true, then I agree, the description can be improved, and it will be useful, thanks! Will add it in the next version of the patch soon.
>
> On 2/16/24 11:17, Markus Armbruster wrote:
>> Roman Khapov <rkhapov@yandex-team.ru> writes:
>>
>>> This commit adds the optional field reason for the events, which
>>> contains the string, describing reason of status changing.
>>> For example: reason of migration fail.
>>>
>>> Function migrate_set_state now accepts 4th argument: the reason to
>>> pass to event. Every call of this function appended with NULL argument.
>>>
>>> Also migrate_set_state_err_reason was added to form reason from Error*
>>>
>>> Signed-off-by: Roman Khapov <rkhapov@yandex-team.ru>
>>
>> [...]
>>
>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>> index 5a565d9b8d..33bb5b7f50 100644
>>> --- a/qapi/migration.json
>>> +++ b/qapi/migration.json
>>> @@ -1392,6 +1392,7 @@
>>> # Emitted when a migration event happens
>>> #
>>> # @status: @MigrationStatus describing the current migration status.
>>> +# @reason: Optional description of status changing reason.
>>
>> Intended use?
>>
>> When is it present?
>>
>>> #
>>> # Since: 2.4
>>> #
>>> @@ -1402,7 +1403,7 @@
>>> # "data": {"status": "completed"} }
>>> ##
>>> { 'event': 'MIGRATION',
>>> - 'data': {'status': 'MigrationStatus'}}
>>> + 'data': {'status': 'MigrationStatus', '*reason': 'str'}}
>>> ##
>>> # @MIGRATION_PASS:
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] qapi/migration.json: add reason to MIGRATION event
2024-02-15 12:27 ` [PATCH v2 1/2] qapi/migration.json: add reason to " Roman Khapov
2024-02-16 6:17 ` Markus Armbruster
@ 2024-02-20 5:39 ` Peter Xu
2024-02-21 13:47 ` Roman Khapov
1 sibling, 1 reply; 12+ messages in thread
From: Peter Xu @ 2024-02-20 5:39 UTC (permalink / raw)
To: Roman Khapov; +Cc: qemu-devel, farosas, eblake, armbru, yc-core
On Thu, Feb 15, 2024 at 05:27:58PM +0500, Roman Khapov wrote:
> migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
> - MIGRATION_STATUS_COMPLETED);
> + MIGRATION_STATUS_COMPLETED, NULL);
Instead of enforcing migrate_set_error() to always pass an error, maybe we
can allow migrate_generate_event() to fetch s->error in FAILED state, if
that hint ever existed?
--
Peter Xu
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] qapi/migration.json: add reason to MIGRATION event
2024-02-19 6:35 ` Markus Armbruster
@ 2024-02-21 13:36 ` Roman Khapov
0 siblings, 0 replies; 12+ messages in thread
From: Roman Khapov @ 2024-02-21 13:36 UTC (permalink / raw)
To: Markus Armbruster; +Cc: qemu-devel, peterx, farosas, eblake, yc-core
The reason in MIGRATION event can be used, when there is some helpful
message, that can be added to improve debugging\understanding of the
reason of migration status changing.
I propose the next usage - when qemu sends (MIGRATION status=failed)
event, the error message describing the problem can be sent in that
event too. This is more comfortable way to understand the problem
comparing to reading qemu logs, especially in some cases, where qemu are
launched and monitored through various automation processes (examples:
cloud environments, some 'integration' tests of the qemu, etc).
Probably, there are some other cases when the accompanying message can
be useful, not only in failed migrations, but now I don't know that cases.
If we have common understanding with that, I will improve the doc
comment and the commit message of the first path too in the next version
of the series.
On 2/19/24 11:35, Markus Armbruster wrote:
> Roman Khapov <rkhapov@yandex-team.ru> writes:
>
>> To be clear: you meant that the description of the event must be extended, similar to its description on the commit message? Or you don't see its proper usage at all?
> The commit message doesn't really tell me either why and how anybody
> would use @reason. Once we have a common understanding there, improving
> the doc comment should be straightforward.
--
Best regards,
Roman Khapov
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] qapi/migration.json: add reason to MIGRATION event
2024-02-20 5:39 ` Peter Xu
@ 2024-02-21 13:47 ` Roman Khapov
2024-02-22 9:22 ` Peter Xu
0 siblings, 1 reply; 12+ messages in thread
From: Roman Khapov @ 2024-02-21 13:47 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, farosas, eblake, armbru, yc-core
If I understood you right, you suggest to improve migrate_generate_event
to accept MigrationState* instead of int* state (which is pointing to
field MigrationState.state in all usages), and get error reason from
MigrationState.error, if the new state is MIGRATION_STATE_FAILED, is it?
That sounds reasonable, thanks!
But I'm not sure if I got the idea of changing migrate_set_error
correctly, can you explain in more details, please?
On 2/20/24 10:39, Peter Xu wrote:
> On Thu, Feb 15, 2024 at 05:27:58PM +0500, Roman Khapov wrote:
>> migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
>> - MIGRATION_STATUS_COMPLETED);
>> + MIGRATION_STATUS_COMPLETED, NULL);
> Instead of enforcing migrate_set_error() to always pass an error, maybe we
> can allow migrate_generate_event() to fetch s->error in FAILED state, if
> that hint ever existed?
>
--
Best regards,
Roman Khapov
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/2] Field 'reason' for MIGRATION event
2024-02-15 12:27 [PATCH v2 0/2] Field 'reason' for MIGRATION event Roman Khapov
2024-02-15 12:27 ` [PATCH v2 1/2] qapi/migration.json: add reason to " Roman Khapov
2024-02-15 12:27 ` [PATCH v2 2/2] migration: add error reason for failed MIGRATION events Roman Khapov
@ 2024-02-21 14:45 ` Fabiano Rosas
2024-02-22 7:01 ` Markus Armbruster
2 siblings, 1 reply; 12+ messages in thread
From: Fabiano Rosas @ 2024-02-21 14:45 UTC (permalink / raw)
To: Roman Khapov, qemu-devel; +Cc: peterx, eblake, armbru, yc-core, Roman Khapov
Roman Khapov <rkhapov@yandex-team.ru> writes:
Hi Roman,
> This is resending of series 20240215082659.1378342-1-rkhapov@yandex-team.ru,
> where patch subjects numbers were broken in patch 2/2.
>
> Sometimes, when migration fails, it is hard to find out
> the cause of the problems: you have to grep qemu logs.
> At the same time, there is MIGRATION event, which looks like
> suitable place to hold such error descriptions.
query-migrate after the event is received should be enough for giving
you the failure reason. We have that in error-desc. See commit
c94143e587 ("migration: Display error in query-migrate irrelevant of
status").
>
> To handle situation like this (maybe one day it will be useful
> for other MIGRATION statuses to have additional 'reason' strings),
I find it unlikely. There's no "reason" for making progress except
that's how things work. Only the exceptional (i.e. failure) statuses
would have a reason. Today that's FAILED only, maybe also
POSTCOPY_PAUSED.
> the general optional field 'reason' can be added.
>
> The series proposes next changes:
>
> 1. Adding optional 'reason' field of type str into
> qapi/migration.json MIGRATION event
>
> 2. Passing some error description as reason for every place, which
> sets migration state to MIGRATION_STATUS_FAILED
>
> After the series, MIGRATION event will looks like this:
> {"execute": "qmp_capabilities"}
> {"return": {}}
> {"event": "MIGRATION", "data": {"status": "setup"}}
> {"event": "MIGRATION", "data": {"status": "failed", "reason": "Failed to connect to '/tmp/sock.sock': No such file or directory"}}
>
> Roman Khapov (2):
> qapi/migration.json: add reason to MIGRATION event
> migration: add error reason for failed MIGRATION events
>
> migration/colo.c | 6 +-
> migration/migration.c | 128 ++++++++++++++++++++++++++++--------------
> migration/migration.h | 5 +-
> migration/multifd.c | 10 ++--
> migration/savevm.c | 24 ++++----
> qapi/migration.json | 3 +-
> 6 files changed, 112 insertions(+), 64 deletions(-)
Please remember to run make check:
380/383 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test ERROR
104.77s killed by signal 6 SIGABRT
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
stderr: Broken pipe ../tests/qtest/libqtest.c:204: kill_qemu() detected
QEMU death from signal 11 (Segmentation fault) (core dumped)
Most likely one of the new error_setg has broken postcopy recovery. Some
of those paths are not intended to trigger cleanup.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/2] Field 'reason' for MIGRATION event
2024-02-21 14:45 ` [PATCH v2 0/2] Field 'reason' for MIGRATION event Fabiano Rosas
@ 2024-02-22 7:01 ` Markus Armbruster
0 siblings, 0 replies; 12+ messages in thread
From: Markus Armbruster @ 2024-02-22 7:01 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: Roman Khapov, qemu-devel, peterx, eblake, yc-core
Fabiano Rosas <farosas@suse.de> writes:
> Roman Khapov <rkhapov@yandex-team.ru> writes:
>
> Hi Roman,
>
>> This is resending of series 20240215082659.1378342-1-rkhapov@yandex-team.ru,
>> where patch subjects numbers were broken in patch 2/2.
>>
>> Sometimes, when migration fails, it is hard to find out
>> the cause of the problems: you have to grep qemu logs.
>> At the same time, there is MIGRATION event, which looks like
>> suitable place to hold such error descriptions.
>
> query-migrate after the event is received should be enough for giving
> you the failure reason. We have that in error-desc. See commit
> c94143e587 ("migration: Display error in query-migrate irrelevant of
> status").
>
>>
>> To handle situation like this (maybe one day it will be useful
>> for other MIGRATION statuses to have additional 'reason' strings),
>
> I find it unlikely. There's no "reason" for making progress except
> that's how things work. Only the exceptional (i.e. failure) statuses
> would have a reason. Today that's FAILED only, maybe also
> POSTCOPY_PAUSED.
I can't see a need for the proposed feature then.
>> the general optional field 'reason' can be added.
[...]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] qapi/migration.json: add reason to MIGRATION event
2024-02-21 13:47 ` Roman Khapov
@ 2024-02-22 9:22 ` Peter Xu
0 siblings, 0 replies; 12+ messages in thread
From: Peter Xu @ 2024-02-22 9:22 UTC (permalink / raw)
To: Roman Khapov; +Cc: qemu-devel, farosas, eblake, armbru, yc-core
On Wed, Feb 21, 2024 at 06:47:47PM +0500, Roman Khapov wrote:
> If I understood you right, you suggest to improve migrate_generate_event to
> accept MigrationState* instead of int* state (which is pointing to field
> MigrationState.state in all usages), and get error reason from
> MigrationState.error, if the new state is MIGRATION_STATE_FAILED, is it?
migrate_generate_event() is a migration internal function, it can directly
reference s->error with error_mutex held.
>
> That sounds reasonable, thanks!
>
> But I'm not sure if I got the idea of changing migrate_set_error correctly,
> can you explain in more details, please?
I fat-fingered.. sorry. I wanted to say migrate_set_state() below, and I
think migrate_set_state() can be kept untouched.
But please consider the other reviewer's comment first: if query-migrate
(or HMP "info migrate") works for you, then this interface is not needed.
>
> On 2/20/24 10:39, Peter Xu wrote:
> > On Thu, Feb 15, 2024 at 05:27:58PM +0500, Roman Khapov wrote:
> > > migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
> > > - MIGRATION_STATUS_COMPLETED);
> > > + MIGRATION_STATUS_COMPLETED, NULL);
> > Instead of enforcing migrate_set_error() to always pass an error, maybe we
> > can allow migrate_generate_event() to fetch s->error in FAILED state, if
> > that hint ever existed?
> >
> --
> Best regards,
> Roman Khapov
>
--
Peter Xu
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-02-22 9:23 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-15 12:27 [PATCH v2 0/2] Field 'reason' for MIGRATION event Roman Khapov
2024-02-15 12:27 ` [PATCH v2 1/2] qapi/migration.json: add reason to " Roman Khapov
2024-02-16 6:17 ` Markus Armbruster
2024-02-18 14:32 ` Roman Khapov
2024-02-19 6:35 ` Markus Armbruster
2024-02-21 13:36 ` Roman Khapov
2024-02-20 5:39 ` Peter Xu
2024-02-21 13:47 ` Roman Khapov
2024-02-22 9:22 ` Peter Xu
2024-02-15 12:27 ` [PATCH v2 2/2] migration: add error reason for failed MIGRATION events Roman Khapov
2024-02-21 14:45 ` [PATCH v2 0/2] Field 'reason' for MIGRATION event Fabiano Rosas
2024-02-22 7:01 ` Markus Armbruster
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).