From: Juan Quintela <quintela@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Eric Farman" <farman@linux.ibm.com>,
	"Laurent Vivier" <lvivier@redhat.com>,
	"David Gibson" <david@gibson.dropbear.id.au>,
	qemu-block@nongnu.org, "Stefan Hajnoczi" <stefanha@redhat.com>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Samuel Thibault" <samuel.thibault@ens-lyon.org>,
	qemu-s390x@nongnu.org,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Leonardo Bras" <leobras@redhat.com>,
	"Corey Minyard" <cminyard@mvista.com>,
	"Ilya Leoshkevich" <iii@linux.ibm.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Christian Borntraeger" <borntraeger@linux.ibm.com>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	qemu-ppc@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Halil Pasic" <pasic@linux.ibm.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Cédric Le Goater" <clg@kaod.org>,
	"Li Zhijian" <lizhijian@fujitsu.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Denis V. Lunev" <den@openvz.org>,
	"Hanna Reitz" <hreitz@redhat.com>,
	"Fabiano Rosas" <farosas@suse.de>,
	"Stefan Berger" <stefanb@linux.vnet.ibm.com>,
	qemu-arm@nongnu.org,
	"Daniel Henrique Barboza" <danielhb413@gmail.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Corey Minyard" <minyard@acm.org>, "John Snow" <jsnow@redhat.com>,
	"Jeff Cody" <codyprime@gmail.com>, "Peter Xu" <peterx@redhat.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Harsh Prateek Bora" <harshpb@linux.ibm.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Stefan Weil" <sw@weilnetz.de>,
	"Mark Cave-Ayland" <mark.cave-ayland@ilande.co.uk>,
	"Fam Zheng" <fam@euphon.net>, "Xiaohui Li" <xiaohli@redhat.com>
Subject: [PULL 27/40] migration: Allow network to fail even during recovery
Date: Thu,  2 Nov 2023 12:40:41 +0100
Message-ID: <20231102114054.44360-28-quintela@redhat.com>
In-Reply-To: <20231102114054.44360-1-quintela@redhat.com>

From: Peter Xu <peterx@redhat.com>

Normally the postcopy RECOVER phase should only last for a very short
period: it is the window in which QEMU tries to recover from an
interrupted postcopy migration.  During that window a handshake is carried
out to continue the procedure, with the state changing from PAUSED ->
RECOVER -> POSTCOPY_ACTIVE again.

The RECOVER phase should be very short because it starts right after the
admin has specified a new, working network link for src QEMU to reconnect
to dest QEMU.

However, there can still be cases where the channel breaks within this
small RECOVER window.

If that happens, with the current code there is no way for the src QEMU to
get kicked out of the RECOVER stage.  There is also no way to retry the
recovery over another channel once one is established.

This patch allows the RECOVER phase itself to fail too.  We're mostly
ready, with just a few small things missing, e.g. properly kicking the main
migration thread out when it is sleeping on rp_sem and we find that we're
at the RECOVER stage.  When this happens, the RECOVER itself fails and the
state rolls back to PAUSED.  The user can then retry another round of
recovery.
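
As a rough illustration of the wait/kick idea above, here is a minimal
standalone sketch.  It is not the QEMU code: DemoState and the demo_*
functions are hypothetical names, a POSIX sem_t stands in for rp_sem, and
an atomic flag stands in for migrate_has_error().  The point is only that
the waiter checks for an error both before and after sleeping, so a kick
that follows a failure makes it return non-zero instead of sleeping again.

```c
#include <assert.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant parts of MigrationState. */
typedef struct {
    sem_t rp_sem;           /* models s->rp_state.rp_sem */
    atomic_bool has_error;  /* models what migrate_has_error(s) reports */
} DemoState;

void demo_init(DemoState *s)
{
    int ret = sem_init(&s->rp_sem, 0, 0);

    assert(ret == 0);
    (void)ret;
    atomic_init(&s->has_error, false);
}

/* Models migration_rp_kick(): wake up whoever sleeps on rp_sem. */
void demo_kick(DemoState *s)
{
    sem_post(&s->rp_sem);
}

/* Models a failing return path: record the error first, then kick. */
void demo_fail_and_kick(DemoState *s)
{
    atomic_store(&s->has_error, true);
    demo_kick(s);
}

/* Models migration_rp_wait(): non-zero tells the caller to give up. */
int demo_rp_wait(DemoState *s)
{
    /* An error was already recorded, so do not go to sleep at all. */
    if (atomic_load(&s->has_error)) {
        return -1;
    }

    /* Sleep until kicked; retry if the wait is interrupted by a signal. */
    while (sem_wait(&s->rp_sem) != 0) {
        continue;
    }

    /* The kick may have been sent because of a failure, so check again. */
    if (atomic_load(&s->has_error)) {
        return -1;
    }

    return 0;
}
```

A caller that loops on demo_rp_wait() while the state is still RECOVER
would return an error as soon as it sees -1, which is what lets the state
fall back to PAUSED.  In the real patch the kick comes from another thread
(the return path thread or the QMP handler); the sketch leaves threading
out to stay short.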

To make it even more robust, teach the QMP command migrate-pause to
explicitly kick src/dst QEMU out when needed, so that even if for some
reason the migration thread has not already been kicked out by a failing
return-path thread, the admin can still kick it out.

This will be a super, super corner case, but still try to cover it.

One can try to test this with two proxy channels for migration:

  (a) socat unix-listen:/tmp/src.sock,reuseaddr,fork tcp:localhost:10000
  (b) socat tcp-listen:10000,reuseaddr,fork unix:/tmp/dst.sock

So the migration channel will be:

                      (a)          (b)
  src -> /tmp/src.sock -> tcp:10000 -> /tmp/dst.sock -> dst

Then to make QEMU hang at the RECOVER stage, one can do the following:

  (1) stop the postcopy using QMP command migrate-pause
  (2) kill the 2nd proxy (b)
  (3) try to recover the postcopy using /tmp/src.sock on src
  (4) src QEMU will go into RECOVER stage but won't be able to continue
      from there, because the channel is actually broken at (b)

Before this patch, step (4) leaves src QEMU stuck in the RECOVER stage,
with no way to kick QEMU out or continue the postcopy again.  After this
patch, (4) quickly fails and src QEMU bounces back to the PAUSED stage.

The admin can also kick QEMU from (4) into PAUSED using migrate-pause when
needed.

After bouncing back to PAUSED stage, one can recover again.
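
The state transitions described above can be summarized with a small
model.  This is only a sketch of the intended behavior after this patch,
not QEMU code: DemoStatus and the demo_* functions are hypothetical, and
the whole handshake is reduced to a single channel_ok flag.

```c
#include <stdbool.h>

/* Hypothetical, simplified subset of the postcopy migration states. */
typedef enum {
    DEMO_PAUSED,
    DEMO_RECOVER,
    DEMO_ACTIVE,
} DemoStatus;

/* Mirrors the idea of migration_postcopy_is_alive(): ACTIVE or RECOVER. */
bool demo_postcopy_is_alive(DemoStatus st)
{
    return st == DEMO_ACTIVE || st == DEMO_RECOVER;
}

/* Admin asks to recover: only meaningful from PAUSED. */
DemoStatus demo_start_recover(DemoStatus st)
{
    return st == DEMO_PAUSED ? DEMO_RECOVER : st;
}

/*
 * Outcome of the resume handshake.  With this patch a broken channel
 * sends RECOVER back to PAUSED instead of leaving it stuck.
 */
DemoStatus demo_finish_recover(DemoStatus st, bool channel_ok)
{
    if (st != DEMO_RECOVER) {
        return st;
    }
    return channel_ok ? DEMO_ACTIVE : DEMO_PAUSED;
}

/* migrate-pause: now accepted in RECOVER as well as in ACTIVE. */
DemoStatus demo_migrate_pause(DemoStatus st)
{
    return demo_postcopy_is_alive(st) ? DEMO_PAUSED : st;
}
```

In this model, step (4) corresponds to demo_start_recover() followed by
demo_finish_recover() with channel_ok set to false, which lands in
DEMO_PAUSED so that another recovery round can be started.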

Reported-by: Xiaohui Li <xiaohli@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2111332
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231017202633.296756-3-peterx@redhat.com>
---
 migration/migration.h |  8 ++++--
 migration/migration.c | 63 +++++++++++++++++++++++++++++++++++++++----
 migration/ram.c       |  4 ++-
 3 files changed, 67 insertions(+), 8 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index 615b517594..af8c965b7f 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -494,6 +494,7 @@ int migrate_init(MigrationState *s, Error **errp);
 bool migration_is_blocked(Error **errp);
 /* True if outgoing migration has entered postcopy phase */
 bool migration_in_postcopy(void);
+bool migration_postcopy_is_alive(int state);
 MigrationState *migrate_get_current(void);
 
 uint64_t ram_get_total_transferred_pages(void);
@@ -534,8 +535,11 @@ void migration_populate_vfio_info(MigrationInfo *info);
 void migration_reset_vfio_bytes_transferred(void);
 void postcopy_temp_page_reset(PostcopyTmpPage *tmp_page);
 
-/* Migration thread waiting for return path thread. */
-void migration_rp_wait(MigrationState *s);
+/*
+ * Migration thread waiting for return path thread.  Return non-zero if an
+ * error is detected.
+ */
+int migration_rp_wait(MigrationState *s);
 /*
  * Kick the migration thread waiting for return path messages.  NOTE: the
  * name can be slightly confusing (when read as "kick the rp thread"), just
diff --git a/migration/migration.c b/migration/migration.c
index 455ddc896a..e875ea0d6b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1393,6 +1393,17 @@ bool migration_in_postcopy(void)
     }
 }
 
+bool migration_postcopy_is_alive(int state)
+{
+    switch (state) {
+    case MIGRATION_STATUS_POSTCOPY_ACTIVE:
+    case MIGRATION_STATUS_POSTCOPY_RECOVER:
+        return true;
+    default:
+        return false;
+    }
+}
+
 bool migration_in_postcopy_after_devices(MigrationState *s)
 {
     return migration_in_postcopy() && s->postcopy_after_devices;
@@ -1673,8 +1684,15 @@ void qmp_migrate_pause(Error **errp)
     MigrationIncomingState *mis = migration_incoming_get_current();
     int ret = 0;
 
-    if (ms->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+    if (migration_postcopy_is_alive(ms->state)) {
         /* Source side, during postcopy */
+        Error *error = NULL;
+
+        /* Tell the core migration that we're pausing */
+        error_setg(&error, "Postcopy migration is paused by the user");
+        migrate_set_error(ms, error);
+        error_free(error);
+
         qemu_mutex_lock(&ms->qemu_file_lock);
         if (ms->to_dst_file) {
             ret = qemu_file_shutdown(ms->to_dst_file);
@@ -1683,10 +1701,17 @@ void qmp_migrate_pause(Error **errp)
         if (ret) {
             error_setg(errp, "Failed to pause source migration");
         }
+
+        /*
+         * Kick the migration thread out of any waiting windows (on behalf
+         * of the rp thread).
+         */
+        migration_rp_kick(ms);
+
         return;
     }
 
-    if (mis->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+    if (migration_postcopy_is_alive(mis->state)) {
         ret = qemu_file_shutdown(mis->from_src_file);
         if (ret) {
             error_setg(errp, "Failed to pause destination migration");
@@ -1695,7 +1720,7 @@ void qmp_migrate_pause(Error **errp)
     }
 
     error_setg(errp, "migrate-pause is currently only supported "
-               "during postcopy-active state");
+               "during postcopy-active or postcopy-recover state");
 }
 
 bool migration_is_blocked(Error **errp)
@@ -1882,9 +1907,21 @@ void qmp_migrate_continue(MigrationStatus state, Error **errp)
     qemu_sem_post(&s->pause_sem);
 }
 
-void migration_rp_wait(MigrationState *s)
+int migration_rp_wait(MigrationState *s)
 {
+    /* If migration has already failed, skip the wait */
+    if (migrate_has_error(s)) {
+        return -1;
+    }
+
     qemu_sem_wait(&s->rp_state.rp_sem);
+
+    /* After wait, double check that there's no failure */
+    if (migrate_has_error(s)) {
+        return -1;
+    }
+
+    return 0;
 }
 
 void migration_rp_kick(MigrationState *s)
@@ -2146,6 +2183,20 @@ out:
         trace_source_return_path_thread_bad_end();
     }
 
+    if (ms->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
+        /*
+         * This is extremely unlikely: we hit yet another network issue
+         * while recovering from the first network failure.  During this
+         * period the main migration thread can be waiting on rp_sem for
+         * this thread to sync with the other side.
+         *
+         * When this happens, explicitly kick the migration thread out of
+         * the RECOVER stage and back to PAUSED, so the admin can try
+         * everything again.
+         */
+        migration_rp_kick(ms);
+    }
+
     trace_source_return_path_thread_end();
     rcu_unregister_thread();
 
@@ -2611,7 +2662,9 @@ static int postcopy_resume_handshake(MigrationState *s)
     qemu_savevm_send_postcopy_resume(s->to_dst_file);
 
     while (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
-        migration_rp_wait(s);
+        if (migration_rp_wait(s)) {
+            return -1;
+        }
     }
 
     if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
diff --git a/migration/ram.c b/migration/ram.c
index d05ffddbc8..929cba08f4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4099,7 +4099,9 @@ static int ram_dirty_bitmap_sync_all(MigrationState *s, RAMState *rs)
 
     /* Wait until all the ramblocks' dirty bitmap synced */
     while (qatomic_read(&rs->postcopy_bmap_sync_requested)) {
-        migration_rp_wait(s);
+        if (migration_rp_wait(s)) {
+            return -1;
+        }
     }
 
     trace_ram_dirty_bitmap_sync_complete();
-- 
2.41.0


