* [PATCH] migration: Fix blocking in POSTCOPY_DEVICE during package load
From: Pranav Tyagi @ 2026-04-21  5:22 UTC
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Juraj Marcin, Prasad Pandit,
	Pranav Tyagi

The postcopy_package_loaded_event is never set if MIG_RP_MSG_PONG from
the destination does not reach the return path thread on the source. The
migration thread then blocks indefinitely in the POSTCOPY_DEVICE state,
waiting for that event. In this situation the source VM can safely
resume, because the destination has not started running yet. The pong
message can be lost when the network fails or when the destination
crashes before sending it.

This patch uses the error that the return path thread detects on a
network failure or destination crash: if the migration is still in
POSTCOPY_DEVICE, the thread sets postcopy_package_loaded_event in its
exit path. This wakes the migration thread, which would otherwise wait
for the event forever. After waking up, the migration thread checks for
the error, fails early, and breaks out of the migration loop so that the
VM can resume on the source side.

Fixes: 7b842fe354c6 ("migration: Introduce POSTCOPY_DEVICE state")
Signed-off-by: Pranav Tyagi <prtyagi@redhat.com>
---
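Note for reviewers: the wake-up pattern described in the commit message
can be sketched outside QEMU as below. This is only an illustration
under stated assumptions, not the code in this patch. DemoEvent,
DemoMigration and every demo_* name are hypothetical stand-ins built on
plain pthreads; they are not QEMU APIs. The real patch uses
qemu_event_set() and qemu_event_wait() on
postcopy_package_loaded_event.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an event: a flag that stays set once set. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool set;
} DemoEvent;

typedef enum { DEMO_ACTIVE, DEMO_FAILING } DemoOutcome;

typedef struct {
    DemoEvent package_loaded;
    bool has_error;   /* written before the event is set, read after wait */
    bool pong_lost;   /* test knob: simulate a lost pong */
} DemoMigration;

static void demo_event_init(DemoEvent *ev)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->set = false;
}

static void demo_event_set(DemoEvent *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->set = true;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

static void demo_event_wait(DemoEvent *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->set) {
        pthread_cond_wait(&ev->cond, &ev->lock);
    }
    pthread_mutex_unlock(&ev->lock);
}

/* Plays the role of the return path thread on the source. */
static void *demo_return_path_thread(void *opaque)
{
    DemoMigration *m = opaque;

    if (m->pong_lost) {
        /* Error path: record the error first, then wake the waiter. */
        m->has_error = true;
    }
    /* Set the event on both paths so the waiter can never block forever. */
    demo_event_set(&m->package_loaded);
    return NULL;
}

/* Plays the role of the migration thread waiting for the package load. */
DemoOutcome demo_package_load(bool pong_lost)
{
    DemoMigration m = { .has_error = false, .pong_lost = pong_lost };
    pthread_t rp;
    int ret;

    demo_event_init(&m.package_loaded);
    ret = pthread_create(&rp, NULL, demo_return_path_thread, &m);
    assert(ret == 0);
    (void)ret;

    demo_event_wait(&m.package_loaded);
    /* Being woken does not imply success: check for an error first. */
    DemoOutcome out = m.has_error ? DEMO_FAILING : DEMO_ACTIVE;

    pthread_join(rp, NULL);
    pthread_cond_destroy(&m.package_loaded.cond);
    pthread_mutex_destroy(&m.package_loaded.lock);
    return out;
}
```

In this sketch, demo_package_load(false) returns DEMO_ACTIVE and
demo_package_load(true) returns DEMO_FAILING. The point it illustrates
is that the waiter must re-check the error state after waking, because
the event is now set on the failure path as well.
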
 migration/migration.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index 5c9aaa6e58..1656c1203c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2386,6 +2386,15 @@ out:
     if (err) {
         migrate_error_propagate(ms, err);
         trace_source_return_path_thread_bad_end();
+        if (ms->state == MIGRATION_STATUS_POSTCOPY_DEVICE) {
+            /*
+             * Kick the migration thread if it gets stuck in
+             * POSTCOPY_DEVICE state waiting for
+             * postcopy_package_loaded_event. The event will never be
+             * set as MIG_RP_MSG_PONG from the destination is lost.
+             */
+            qemu_event_set(&ms->postcopy_package_loaded_event);
+        }
     }
 
     if (ms->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
@@ -3232,6 +3241,17 @@ static MigIterateState migration_iteration_run(MigrationState *s)
              * package before actually completing.
              */
             qemu_event_wait(&s->postcopy_package_loaded_event);
+            /*
+             * Check for errors in case the migration thread was stuck in
+             * POSTCOPY_DEVICE state waiting for the
+             * postcopy_package_loaded_event which was never set.
+             * If so, fail now and break out of the iteration.
+             */
+            if (migrate_has_error(s)) {
+                migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_DEVICE,
+                                  MIGRATION_STATUS_FAILING);
+                return MIG_ITERATE_BREAK;
+            }
             migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_DEVICE,
                               MIGRATION_STATUS_POSTCOPY_ACTIVE);
         }
-- 
2.53.0


