qemu-devel.nongnu.org archive mirror
From: "Cédric Le Goater" <clg@redhat.com>
To: peterx@redhat.com, qemu-devel@nongnu.org
Cc: "Michael S . Tsirkin" <mst@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jason Wang <jasowang@redhat.com>, Bandan Das <bdas@redhat.com>,
	Prasad Pandit <ppandit@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH 08/10] docs/migration: Organize "Postcopy" page
Date: Tue, 9 Jan 2024 08:20:09 +0100
Message-ID: <485eca29-8a73-4fc5-83dc-5f8971b8ab0c@redhat.com>
In-Reply-To: <20240109064628.595453-9-peterx@redhat.com>

On 1/9/24 07:46, peterx@redhat.com wrote:
> From: Peter Xu <peterx@redhat.com>
> 
> Reorganize the page, moving things around, and add a few
> headlines ("Postcopy internals", "Postcopy features") to cover sub-areas.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   docs/devel/migration/postcopy.rst | 159 ++++++++++++++++--------------
>   1 file changed, 84 insertions(+), 75 deletions(-)
> 
> diff --git a/docs/devel/migration/postcopy.rst b/docs/devel/migration/postcopy.rst
> index d60eec06ab..6c51e96d79 100644
> --- a/docs/devel/migration/postcopy.rst
> +++ b/docs/devel/migration/postcopy.rst
> @@ -1,6 +1,9 @@
> +========
>   Postcopy
>   ========
>   
> +.. contents::
> +
>   'Postcopy' migration is a way to deal with migrations that refuse to converge

Single quote characters are used in a few places to emphasize words;
that markup should be reworked. The rest looks good, so


Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.



>   (or take too long to converge) its plus side is that there is an upper bound on
>   the amount of migration traffic and time it takes, the down side is that during
> @@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
>   doesn't finish in a given time the switch is made to postcopy.
>   
>   Enabling postcopy
> ------------------
> +=================
>   
>   To enable postcopy, issue this command on the monitor (both source and
>   destination) prior to the start of migration:
> @@ -49,8 +52,71 @@ time per vCPU.
>     ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
>     the destination is waiting for).
>   
> -Postcopy device transfer
> -------------------------
> +Postcopy internals
> +==================
> +
> +State machine
> +-------------
> +
> +Postcopy moves through a series of states (see postcopy_state) from
> +ADVISE->DISCARD->LISTEN->RUNNING->END
> +
> + - Advise
> +
> +    Set at the start of migration if postcopy is enabled, even
> +    if it hasn't had the start command; here the destination
> +    checks that its OS has the support needed for postcopy, and performs
> +    setup to ensure the RAM mappings are suitable for later postcopy.
> +    The destination will fail early in migration at this point if the
> +    required OS support is not present.
> +    (Triggered by reception of POSTCOPY_ADVISE command)
> +
> + - Discard
> +
> +    Entered on receipt of the first 'discard' command; prior to
> +    the first Discard being performed, hugepages are switched off
> +    (using madvise) to ensure that no new huge pages are created
> +    during the postcopy phase, and to cause any huge pages that
> +    have discards on them to be broken.
> +
> + - Listen
> +
> +    The first command in the package, POSTCOPY_LISTEN, switches
> +    the destination state to Listen, and starts a new thread
> +    (the 'listen thread') which takes over the job of receiving
> +    pages off the migration stream, while the main thread carries
> +    on processing the blob.  With this thread able to process page
> +    reception, the destination now 'sensitises' the RAM to detect
> +    any access to missing pages (on Linux using the 'userfault'
> +    system).
> +
> + - Running
> +
> +    POSTCOPY_RUN causes the destination to synchronise all
> +    state and start the CPUs and IO devices running.  The main
> +    thread now finishes processing the migration package and
> +    now carries on as it would for normal precopy migration
> +    (although it can't do the cleanup it would do as it
> +    finishes a normal migration).
> +
> + - Paused
> +
> +    Postcopy can run into a paused state (normally on both sides when
> +    happens), where all threads will be temporarily halted mostly due to
> +    network errors.  When reaching paused state, migration will make sure
> +    the qemu binary on both sides maintain the data without corrupting
> +    the VM.  To continue the migration, the admin needs to fix the
> +    migration channel using the QMP command 'migrate-recover' on the
> +    destination node, then resume the migration using QMP command 'migrate'
> +    again on source node, with resume=true flag set.
> +
> + - End
> +
> +    The listen thread can now quit, and perform the cleanup of migration
> +    state, the migration is now complete.
> +
> +Device transfer
> +---------------
>   
>   Loading of device data may cause the device emulation to access guest RAM
>   that may trigger faults that have to be resolved by the source, as such
> @@ -130,7 +196,20 @@ processing.
>      is no longer used by migration, while the listen thread carries on servicing
>      page data until the end of migration.
>   
> -Postcopy Recovery
> +Source side page bitmap
> +-----------------------
> +
> +The 'migration bitmap' in postcopy is basically the same as in the precopy,
> +where each of the bit to indicate that page is 'dirty' - i.e. needs
> +sending.  During the precopy phase this is updated as the CPU dirties
> +pages, however during postcopy the CPUs are stopped and nothing should
> +dirty anything any more. Instead, dirty bits are cleared when the relevant
> +pages are sent during postcopy.
> +
> +Postcopy features
> +=================
> +
> +Postcopy recovery
>   -----------------
>   
>   Comparing to precopy, postcopy is special on error handlings.  When any
> @@ -166,76 +245,6 @@ configurations of the guest.  For example, when with async page fault
>   enabled, logically the guest can proactively schedule out the threads
>   accessing missing pages.
>   
> -Postcopy states
> ----------------
> -
> -Postcopy moves through a series of states (see postcopy_state) from
> -ADVISE->DISCARD->LISTEN->RUNNING->END
> -
> - - Advise
> -
> -    Set at the start of migration if postcopy is enabled, even
> -    if it hasn't had the start command; here the destination
> -    checks that its OS has the support needed for postcopy, and performs
> -    setup to ensure the RAM mappings are suitable for later postcopy.
> -    The destination will fail early in migration at this point if the
> -    required OS support is not present.
> -    (Triggered by reception of POSTCOPY_ADVISE command)
> -
> - - Discard
> -
> -    Entered on receipt of the first 'discard' command; prior to
> -    the first Discard being performed, hugepages are switched off
> -    (using madvise) to ensure that no new huge pages are created
> -    during the postcopy phase, and to cause any huge pages that
> -    have discards on them to be broken.
> -
> - - Listen
> -
> -    The first command in the package, POSTCOPY_LISTEN, switches
> -    the destination state to Listen, and starts a new thread
> -    (the 'listen thread') which takes over the job of receiving
> -    pages off the migration stream, while the main thread carries
> -    on processing the blob.  With this thread able to process page
> -    reception, the destination now 'sensitises' the RAM to detect
> -    any access to missing pages (on Linux using the 'userfault'
> -    system).
> -
> - - Running
> -
> -    POSTCOPY_RUN causes the destination to synchronise all
> -    state and start the CPUs and IO devices running.  The main
> -    thread now finishes processing the migration package and
> -    now carries on as it would for normal precopy migration
> -    (although it can't do the cleanup it would do as it
> -    finishes a normal migration).
> -
> - - Paused
> -
> -    Postcopy can run into a paused state (normally on both sides when
> -    happens), where all threads will be temporarily halted mostly due to
> -    network errors.  When reaching paused state, migration will make sure
> -    the qemu binary on both sides maintain the data without corrupting
> -    the VM.  To continue the migration, the admin needs to fix the
> -    migration channel using the QMP command 'migrate-recover' on the
> -    destination node, then resume the migration using QMP command 'migrate'
> -    again on source node, with resume=true flag set.
> -
> - - End
> -
> -    The listen thread can now quit, and perform the cleanup of migration
> -    state, the migration is now complete.
> -
> -Source side page map
> ---------------------
> -
> -The 'migration bitmap' in postcopy is basically the same as in the precopy,
> -where each of the bit to indicate that page is 'dirty' - i.e. needs
> -sending.  During the precopy phase this is updated as the CPU dirties
> -pages, however during postcopy the CPUs are stopped and nothing should
> -dirty anything any more. Instead, dirty bits are cleared when the relevant
> -pages are sent during postcopy.
> -
>   Postcopy with hugepages
>   -----------------------
>   
> @@ -293,7 +302,7 @@ Retro-fitting postcopy to existing clients is possible:
>        guest memory access is made while holding a lock then all other
>        threads waiting for that lock will also be blocked.
>   
> -Postcopy Preemption Mode
> +Postcopy preemption mode
>   ------------------------
>   
>   Postcopy preempt is a new capability introduced in 8.0 QEMU release, it
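
For illustration only, and not taken from QEMU's sources: the state
sequence quoted above (ADVISE->DISCARD->LISTEN->RUNNING->END) can be
sketched as a small transition table. The names PostcopyState,
TRANSITIONS and advance() are hypothetical, and the assumption that
Paused is entered from Running and left back to Running after recovery
is a reading of the quoted text, not something the patch states.

```python
from enum import Enum


class PostcopyState(Enum):
    # Illustrative only: mirrors the state names in the quoted text,
    # not QEMU's actual C enum.
    ADVISE = 1
    DISCARD = 2
    LISTEN = 3
    RUNNING = 4
    PAUSED = 5
    END = 6


# Hypothetical transition table. The linear path follows the quoted
# documentation; the RUNNING <-> PAUSED edges are an assumption.
TRANSITIONS = {
    PostcopyState.ADVISE: {PostcopyState.DISCARD},
    PostcopyState.DISCARD: {PostcopyState.LISTEN},
    PostcopyState.LISTEN: {PostcopyState.RUNNING},
    PostcopyState.RUNNING: {PostcopyState.PAUSED, PostcopyState.END},
    PostcopyState.PAUSED: {PostcopyState.RUNNING},
    PostcopyState.END: set(),
}


def advance(current, target):
    """Return target if current -> target is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(
            f"illegal transition {current.name} -> {target.name}"
        )
    return target
```

Walking ADVISE, DISCARD, LISTEN, RUNNING, END through advance()
succeeds, while skipping a step (for example ADVISE straight to
RUNNING) raises ValueError.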


