From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>, qemu-devel@nongnu.org
Cc: Thomas Huth <thuth@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Laurent Vivier <lvivier@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	Prasad Pandit <ppandit@redhat.com>,
	peterx@redhat.com, Jiri Denemark <jdenemar@redhat.com>,
	Bandan Das <bdas@redhat.com>
Subject: Re: [PATCH v2 06/10] migration/docs: Update postcopy recover session for SETUP phase
Date: Mon, 17 Jun 2024 16:47:56 -0300
Message-ID: <87frtbbbbn.fsf@suse.de>
In-Reply-To: <20240617181534.1425179-7-peterx@redhat.com>

Peter Xu <peterx@redhat.com> writes:

> Firstly, the "Paused" state was added in the wrong place before. The state
> machine section was describing PostcopyState, rather than MigrationStatus.
> Drop the Paused state descriptions.
>
> Then in the postcopy recover session, add more information on the state
> machine for MigrationStatus in the lines.  Add the new RECOVER_SETUP phase.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>

> ---
>  docs/devel/migration/postcopy.rst | 31 ++++++++++++++++---------------
>  1 file changed, 16 insertions(+), 15 deletions(-)
>
> diff --git a/docs/devel/migration/postcopy.rst b/docs/devel/migration/postcopy.rst
> index 6c51e96d79..a15594e11f 100644
> --- a/docs/devel/migration/postcopy.rst
> +++ b/docs/devel/migration/postcopy.rst
> @@ -99,17 +99,6 @@ ADVISE->DISCARD->LISTEN->RUNNING->END
>      (although it can't do the cleanup it would do as it
>      finishes a normal migration).
>  
> - - Paused
> -
> -    Postcopy can run into a paused state (normally on both sides when
> -    happens), where all threads will be temporarily halted mostly due to
> -    network errors.  When reaching paused state, migration will make sure
> -    the qemu binary on both sides maintain the data without corrupting
> -    the VM.  To continue the migration, the admin needs to fix the
> -    migration channel using the QMP command 'migrate-recover' on the
> -    destination node, then resume the migration using QMP command 'migrate'
> -    again on source node, with resume=true flag set.
> -
>   - End
>  
>      The listen thread can now quit, and perform the cleanup of migration
> @@ -221,7 +210,8 @@ paused postcopy migration.
>  
>  The recovery phase normally contains a few steps:
>  
> -  - When network issue occurs, both QEMU will go into PAUSED state
> +  - When network issue occurs, both QEMU will go into **POSTCOPY_PAUSED**
> +    migration state.
>  
>    - When the network is recovered (or a new network is provided), the admin
>      can setup the new channel for migration using QMP command
> @@ -229,9 +219,20 @@ The recovery phase normally contains a few steps:
>  
>    - On source host, the admin can continue the interrupted postcopy
>      migration using QMP command 'migrate' with resume=true flag set.
> -
> -  - After the connection is re-established, QEMU will continue the postcopy
> -    migration on both sides.
> +    Source QEMU will go into **POSTCOPY_RECOVER_SETUP** state trying to
> +    re-establish the channels.
> +
> +  - When both sides of QEMU successfully reconnects using a new or fixed up

s/reconnects/reconnect

I can touch it up when queueing.

> +    channel, they will go into **POSTCOPY_RECOVER** state, some handshake
> +    procedure will be needed to properly synchronize the VM states between
> +    the two QEMUs to continue the postcopy migration.  For example, there
> +    can be pages sent right during the window when the network is
> +    interrupted, then the handshake will guarantee pages lost in-flight
> +    will be resent again.
> +
> +  - After a proper handshake synchronization, QEMU will continue the
> +    postcopy migration on both sides and go back to **POSTCOPY_ACTIVE**
> +    state.  Postcopy migration will continue.
>  
>  During a paused postcopy migration, the VM can logically still continue
>  running, and it will not be impacted from any page access to pages that
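
For readers following the thread, here is a minimal Python sketch of the
recovery sequence the updated text describes, as seen from the source side.
It is only an illustration, not QEMU code: the event names, the helper
functions and the example URI are hypothetical. The status strings are
assumed to be the lower-case QMP spellings of the states named in the patch,
and the two QMP messages use the 'migrate-recover' and 'migrate' commands
mentioned in the docs.

```python
import json
from enum import Enum


class Status(Enum):
    # Assumed QMP spellings of the MigrationStatus values named in the patch.
    POSTCOPY_ACTIVE = "postcopy-active"
    POSTCOPY_PAUSED = "postcopy-paused"
    POSTCOPY_RECOVER_SETUP = "postcopy-recover-setup"
    POSTCOPY_RECOVER = "postcopy-recover"


# Source-side transitions taken from the recovery steps in the doc text.
# The event names are hypothetical labels, not QEMU identifiers.
TRANSITIONS = {
    (Status.POSTCOPY_ACTIVE, "network_error"): Status.POSTCOPY_PAUSED,
    (Status.POSTCOPY_PAUSED, "migrate_resume"): Status.POSTCOPY_RECOVER_SETUP,
    (Status.POSTCOPY_RECOVER_SETUP, "reconnected"): Status.POSTCOPY_RECOVER,
    (Status.POSTCOPY_RECOVER, "handshake_done"): Status.POSTCOPY_ACTIVE,
}


def step(state, event):
    """Return the next status, or raise if the doc text describes no such step."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.value} on {event}") from None


def walk(events, state=Status.POSTCOPY_ACTIVE):
    """Apply events in order and return every status visited."""
    path = [state]
    for event in events:
        state = step(state, event)
        path.append(state)
    return path


def migrate_recover_cmd(uri):
    """QMP message for the destination: set up the new migration channel."""
    return {"execute": "migrate-recover", "arguments": {"uri": uri}}


def migrate_resume_cmd(uri):
    """QMP message for the source: continue the paused postcopy migration."""
    return {"execute": "migrate", "arguments": {"uri": uri, "resume": True}}


if __name__ == "__main__":
    uri = "tcp:192.0.2.10:4444"  # hypothetical address, for illustration only
    print(json.dumps(migrate_recover_cmd(uri)))  # sent to the destination
    print(json.dumps(migrate_resume_cmd(uri)))   # sent to the source
    events = ["network_error", "migrate_resume", "reconnected", "handshake_done"]
    print(" -> ".join(s.value for s in walk(events)))
```

The sketch covers only the successful path listed in the recovery steps;
what happens when the reconnect itself fails is outside what the quoted text
describes, so it is left out here.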

