* [PATCH] migration: Attempt disk reactivation in more failure scenarios
From: Eric Blake @ 2023-05-02 20:52 UTC
To: qemu-devel; +Cc: kwolf, Juan Quintela, Peter Xu, Leonardo Bras
Commit fe904ea824 added a fail_invalidate label, which tries to
reactivate disks on the source after a failure while s->state ==
MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
qemu_savevm_state_complete_precopy() failed. This failure to
reactivate is also present in commit 6039dd5b1c (also covering the new
s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring
s->block_inactive is set more reliably).
Consolidate the two labels back into one - no matter HOW migration is
failed, if there is any chance we can reach vm_start() after having
attempted inactivation, it is essential that we have tried to restart
disks before then. This also makes the cleanup more like
migrate_fd_cancel().
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
---
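Note (not part of the commit message): a minimal, self-contained sketch
of the single-label cleanup described above. It is only a model under
stated assumptions. struct mig_state, mock_completion(),
mock_activate_all() and demo_failure_paths() are hypothetical stand-ins
invented for illustration, not QEMU code; the real logic is in the diff.
The sketch records the inactivation attempt in a flag before the step
that can fail, and routes every failure through one label that
reactivates only when that flag is set.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are NOT QEMU types. */
enum mig_status { MIG_ACTIVE, MIG_DEVICE, MIG_POSTCOPY,
                  MIG_FAILED, MIG_COMPLETED };

struct mig_state {
    enum mig_status state;
    bool block_inactive;   /* true once disk inactivation was attempted */
    int reactivations;     /* how often the mock reactivation hook ran */
};

/* Mock reactivation hook; 'succeed' lets a caller simulate an error. */
static bool mock_activate_all(struct mig_state *s, bool succeed)
{
    s->reactivations++;
    return succeed;
}

/*
 * Model of a completion routine with a single failure label.
 * fail_step: 0 = success, 1 = failure before inactivation,
 * 2 = failure right after inactivation was attempted,
 * 3 = a later failure (for example on the return path).
 */
static void mock_completion(struct mig_state *s, int fail_step,
                            bool activate_ok)
{
    if (fail_step == 1) {
        goto fail;
    }
    s->block_inactive = true;   /* remember the attempt before it can fail */
    if (fail_step == 2 || fail_step == 3) {
        goto fail;
    }
    s->state = MIG_COMPLETED;
    return;

fail:
    if (s->block_inactive &&
        (s->state == MIG_ACTIVE || s->state == MIG_DEVICE)) {
        if (mock_activate_all(s, activate_ok)) {
            s->block_inactive = false;
        }
        /* On error the flag is already true, so there is nothing to set. */
    }
    s->state = MIG_FAILED;
}

/* Walks the failure paths; returns the number of unexpected outcomes. */
int demo_failure_paths(void)
{
    int bad = 0;
    struct mig_state s;

    /* Failure before inactivation: nothing to reactivate. */
    s = (struct mig_state){ MIG_ACTIVE, false, 0 };
    mock_completion(&s, 1, true);
    bad += !(s.state == MIG_FAILED && s.reactivations == 0);

    /* Failure after inactivation: disks are reactivated, flag cleared. */
    s = (struct mig_state){ MIG_ACTIVE, false, 0 };
    mock_completion(&s, 2, true);
    bad += !(s.reactivations == 1 && !s.block_inactive);

    /* Reactivation itself fails: the flag stays set. */
    s = (struct mig_state){ MIG_DEVICE, false, 0 };
    mock_completion(&s, 3, false);
    bad += !(s.reactivations == 1 && s.block_inactive);

    return bad;
}
```

In this model, a failure at any step after the flag is set reaches the
same reactivation attempt, which is the property the patch restores for
a failing qemu_savevm_state_complete_precopy().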
migration/migration.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index abcadbb619e..7f982bd2c80 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2299,6 +2299,11 @@ static void migration_completion(MigrationState *s)
                                         MIGRATION_STATUS_DEVICE);
         }
         if (ret >= 0) {
+            /*
+             * Inactivate disks except in COLO, and track that we
+             * have done so in order to remember to reactivate
+             * them if migration fails or is cancelled.
+             */
             s->block_inactive = !migrate_colo();
             qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
             ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
@@ -2343,13 +2348,13 @@ static void migration_completion(MigrationState *s)
         rp_error = await_return_path_close_on_source(s);
         trace_migration_return_path_end_after(rp_error);
         if (rp_error) {
-            goto fail_invalidate;
+            goto fail;
         }
     }
 
     if (qemu_file_get_error(s->to_dst_file)) {
         trace_migration_completion_file_err();
-        goto fail_invalidate;
+        goto fail;
     }
 
     if (migrate_colo() && s->state == MIGRATION_STATUS_ACTIVE) {
@@ -2363,26 +2368,25 @@ static void migration_completion(MigrationState *s)
 
     return;
 
-fail_invalidate:
-    /* If not doing postcopy, vm_start() will be called: let's regain
-     * control on images.
-     */
-    if (s->state == MIGRATION_STATUS_ACTIVE ||
-        s->state == MIGRATION_STATUS_DEVICE) {
+fail:
+    if (s->block_inactive && (s->state == MIGRATION_STATUS_ACTIVE ||
+                              s->state == MIGRATION_STATUS_DEVICE)) {
+        /*
+         * If not doing postcopy, vm_start() will be called: let's
+         * regain control on images.
+         */
         Error *local_err = NULL;
 
         qemu_mutex_lock_iothread();
         bdrv_activate_all(&local_err);
         if (local_err) {
             error_report_err(local_err);
-            s->block_inactive = true;
         } else {
             s->block_inactive = false;
         }
         qemu_mutex_unlock_iothread();
     }
 
-fail:
     migrate_set_state(&s->state, current_active_state,
                       MIGRATION_STATUS_FAILED);
 }
base-commit: b5f47ba73b7c1457d2f18d71c00e1a91a76fe60b
--
2.40.1
* Re: [PATCH] migration: Attempt disk reactivation in more failure scenarios
From: Peter Xu @ 2023-05-02 21:17 UTC
To: Eric Blake; +Cc: qemu-devel, kwolf, Juan Quintela, Leonardo Bras
On Tue, May 02, 2023 at 03:52:12PM -0500, Eric Blake wrote:
> Commit fe904ea824 added a fail_invalidate label, which tries to
> reactivate disks on the source after a failure while s->state ==
> MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
> qemu_savevm_state_complete_precopy() failed. This failure to
> reactivate is also present in commit 6039dd5b1c (also covering the new
> s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring
> s->block_inactive is set more reliably).
>
> Consolidate the two labels back into one - no matter HOW migration is
> failed, if there is any chance we can reach vm_start() after having
> attempted inactivation, it is essential that we have tried to restart
> disks before then. This also makes the cleanup more like
> migrate_fd_cancel().
>
> Suggested-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH] migration: Attempt disk reactivation in more failure scenarios
From: Kevin Wolf @ 2023-05-08 10:42 UTC
To: Eric Blake; +Cc: qemu-devel, Juan Quintela, Peter Xu, Leonardo Bras
On 02.05.2023 at 22:52, Eric Blake wrote:
> Commit fe904ea824 added a fail_invalidate label, which tries to
> reactivate disks on the source after a failure while s->state ==
> MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
> qemu_savevm_state_complete_precopy() failed. This failure to
> reactivate is also present in commit 6039dd5b1c (also covering the new
> s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring
> s->block_inactive is set more reliably).
>
> Consolidate the two labels back into one - no matter HOW migration is
> failed, if there is any chance we can reach vm_start() after having
> attempted inactivation, it is essential that we have tried to restart
> disks before then. This also makes the cleanup more like
> migrate_fd_cancel().
>
> Suggested-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
Thanks, applied to the block branch.
Kevin
* Re: [PATCH] migration: Attempt disk reactivation in more failure scenarios
From: Juan Quintela @ 2023-05-09 15:36 UTC
To: Eric Blake; +Cc: qemu-devel, kwolf, Peter Xu, Leonardo Bras
Eric Blake <eblake@redhat.com> wrote:
> Commit fe904ea824 added a fail_invalidate label, which tries to
> reactivate disks on the source after a failure while s->state ==
> MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
> qemu_savevm_state_complete_precopy() failed. This failure to
> reactivate is also present in commit 6039dd5b1c (also covering the new
> s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring
> s->block_inactive is set more reliably).
>
> Consolidate the two labels back into one - no matter HOW migration is
> failed, if there is any chance we can reach vm_start() after having
> attempted inactivation, it is essential that we have tried to restart
> disks before then. This also makes the cleanup more like
> migrate_fd_cancel().
>
> Suggested-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
I still can't believe that powering down disks and deciding whether (or
not) to restart the VM is such a complicated business. Sniff.