* [PATCH] migration: Fix state transition in postcopy_start() error handling
@ 2025-08-26 11:51 Juraj Marcin
  2025-08-26 18:23 ` Peter Xu
  2025-09-27 14:01 ` Michael Tokarev
  0 siblings, 2 replies; 5+ messages in thread
From: Juraj Marcin @ 2025-08-26 11:51 UTC
  To: qemu-devel; +Cc: Juraj Marcin, Fabiano Rosas, Peter Xu, qemu-stable

From: Juraj Marcin <jmarcin@redhat.com>

Commit 48814111366b ("migration: Always set DEVICE state") introduced
DEVICE state to postcopy, which moved the actual state transition that
leads to POSTCOPY_ACTIVE.

However, the error handling part of postcopy_start() still expects the
state to be POSTCOPY_ACTIVE. Depending on where the error happens, the
state can now be ACTIVE, DEVICE or CANCELLING, but never
POSTCOPY_ACTIVE, because that transition now happens just before a
successful return from the function. As a result, the transition to
FAILED in the error path never takes effect.

Instead, accept any state except CANCELLING when transitioning to FAILED
state.

Cc: qemu-stable@nongnu.org
Fixes: 48814111366b ("migration: Always set DEVICE state")
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>

---
In the RFC[1] where this patch was discussed, there was also a
suggestion for a helper function migrate_set_failure() that would check
that the state is not CANCELLING and then set the migration error and
the FAILED state. I discussed the implementation with Peter, and we came
to the conclusion that instead of patching such a clean-up on top of the
current error handling code, it might be more useful to do a larger
refactor and clean-up of all error handling in the migration code.

Such a clean-up should reduce the number of places where we need to
explicitly transition to the FAILED state (ideally to one, or at most a
couple), and elsewhere only set an appropriate migration error using
migrate_set_error(). It would also refactor inappropriate uses of
QEMUFile errors, where the error is not really an error of the
underlying channel and migrate_set_error() should be used instead.

[1]: https://lore.kernel.org/all/20250807114922.1013286-3-jmarcin@redhat.com/
---
 migration/migration.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 10c216d25d..32b8ce5613 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2872,8 +2872,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
 fail_closefb:
     qemu_fclose(fb);
 fail:
-    migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
-                          MIGRATION_STATUS_FAILED);
+    if (ms->state != MIGRATION_STATUS_CANCELLING) {
+        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
+    }
     migration_block_activate(NULL);
     migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
     bql_unlock();
-- 
2.50.1
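
The reasoning in the commit message can be illustrated with a small
standalone sketch. It assumes that migrate_set_state() behaves as a
compare-and-swap, i.e. it only changes the state when the current value
equals the expected old state. All names below (SketchStatus,
sketch_set_state(), sketch_fail_old(), sketch_fail_new()) are
hypothetical stand-ins written for this illustration, not QEMU code, and
the new path loads the state once rather than reading it twice as the
patch does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MigrationStatus values named above. */
typedef enum {
    SKETCH_ACTIVE,
    SKETCH_DEVICE,
    SKETCH_CANCELLING,
    SKETCH_POSTCOPY_ACTIVE,
    SKETCH_FAILED,
    SKETCH__MAX,
} SketchStatus;

/*
 * Compare-and-swap transition: the state only changes to new_state if
 * it currently equals old_state. Returns true when the swap happened.
 */
static bool sketch_set_state(_Atomic int *state, int old_state, int new_state)
{
    assert(new_state < SKETCH__MAX);
    return atomic_compare_exchange_strong(state, &old_state, new_state);
}

/* Old error path: expects POSTCOPY_ACTIVE, which is never the state here. */
static void sketch_fail_old(_Atomic int *state)
{
    sketch_set_state(state, SKETCH_POSTCOPY_ACTIVE, SKETCH_FAILED);
}

/* New error path: accept any current state except CANCELLING. */
static void sketch_fail_new(_Atomic int *state)
{
    int cur = atomic_load(state);

    if (cur != SKETCH_CANCELLING) {
        sketch_set_state(state, cur, SKETCH_FAILED);
    }
}
```

With a state of SKETCH_DEVICE, sketch_fail_old() leaves the state
unchanged, while sketch_fail_new() moves it to SKETCH_FAILED. A state of
SKETCH_CANCELLING is left alone by both, so a cancellation in progress
is not overwritten.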




Thread overview: 5+ messages
2025-08-26 11:51 [PATCH] migration: Fix state transition in postcopy_start() error handling Juraj Marcin
2025-08-26 18:23 ` Peter Xu
2025-08-26 19:00   ` Fabiano Rosas
2025-09-27 14:01 ` Michael Tokarev
2025-09-29 15:47   ` Peter Xu
